1. INTRODUCTION


This Final Technical Report describes the research and development results of the Neural
Network Communications Signal Processing (NNCSP) Program, contract number
F30602-92-C-0051. The objectives of the NNCSP Program are to: 1) develop and
implement a neural network and communications signal processing simulation system for
the purpose of exploring the applicability of neural network technology to communications
signal processing, 2) demonstrate several configurations of the simulation to illustrate the
system's ability to model many types of neural network based communication systems, and
3) use the simulation to identify the neural network configurations to be included in the
conceptual design of a neural network transceiver that will be developed in a phase II
follow-on program.

1.1  BACKGROUND 

Possible application areas for neural network technology in the communication system
domain are the signal processing functions of transceivers, which include noise cancellation,
demodulation, decoding and channel equalization. A modular design approach that couples
neural network modules with conventional signal processing modules has the potential of
producing a smart radio that exhibits a flexible design and enhances link survivability in an
electronically hostile environment. The current status of neural networks and their
application to the communication system domain remains in a state of basic research. This
basic research has resulted in several papers with general theory and some preliminary
results but very little else. These papers provide an initial baseline and provide motivation
for obtaining a computer aided design system for the creation of neural network based
communication systems. Ongoing work for the "Speakeasy" multiband, multimode radio
program and the smart radio development program at Rome Laboratory requires support in
the application areas mentioned above. Neural networks may be able to provide
communication systems with a great deal of processing power while at the same time
providing a degree of fault tolerance. Previous efforts have addressed small aspects of
communication systems using neural network technology, but an overall simulation system
that compares neural network techniques with conventional signal processing techniques is
unique to this effort. The successful completion of this effort significantly enhances the
state of the art in design capability for smart radio technology.

1.2  SCOPE 

This program represents the first of two developmental phases. In the first phase, the
contractor designs and implements a software simulation environment that will be used for
modeling neural network based communications systems. In the second phase, the
contractor will fabricate a breadboard transceiver based on the results of the phase I effort.
This contract only addresses the phase I neural network communications signal processing
simulation, and the remainder of this description of the technical scope addresses only the
tasks of phase I.

The first task of the NNCSP program was a feasibility study to determine which neural
network paradigms can best be applied to the communication domain. The results of this
feasibility study formed the basis for specifying the neural network modules in the software
simulation environment. The next task was the design and development of a software
package which provides the capability for communication engineers to design and test
communication systems that contain modules of neural network algorithms and also
modules of conventional signal processing techniques. The simulation software provides
the capability to interchange neural network modules with similar conventional signal
processing modules. When the simulation software was completed, the next task was to
demonstrate several configurations of the simulation to illustrate the system's ability to
model many types of neural network based communication systems. The following task
consisted of a series of simulation experiments aimed at identifying neural network based
communications signal processing functions which should be included in future neural
network based transceivers. As part of phase I, the last task was a high-level conceptual
design of a prototype neural network communication system which is recommended for a
phase II implementation. Based on its review of phase I, the Government will decide
whether to proceed with a procurement of a phase II effort.

The software package that was developed and implemented in this contract is named the
Neural Network Communications Simulation System (NNCSS). It is a communications-
oriented digital signal processing (DSP) simulator with the capability to invoke neural
network paradigms into signal processing chains. This is a generic tool which will greatly
facilitate the design and simulation of communication signal processing products that
incorporate embedded neural network technology.

The NNCSS is based on the Signal Processing WorkSystem™ (SPW™), a commercial
off the shelf (COTS) signal processing simulator, offered by Comdisco Systems, Inc.
SPW provides a block diagram approach to constructing signal processing simulations.
The entire prototyping effort is viewed in schematic form on the workstation monitor,
providing a familiar engineering-oriented paradigm for the analyst. Using SPW, the analyst
attaches special function signal processing modules to each other, and provides input signal
sources, which may be either digital data retrieved from disk or real-time analog signals
digitized "on the fly" as the simulation runs. The analyst has the capability to insert
"probes" at various points in the signal processing chain and view signal characteristics at
the probed points. Such probes may collect data for real-time viewing or post simulation
analysis.

In this contract, additional function blocks and codes have been added to SPW to allow the
design and simulation of neural network based communication functions. With the
functions provided by the Neural Network Communications Library, the NNCSS allows
the analyst to compare the performance achieved by neural network based designs with the
performance of similar functions which use conventional signal processing approaches.
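
The block-diagram idea described above (interchangeable modules in a chain, with probes at chosen points) can be illustrated with a minimal Python sketch. This is not SPW or NNCSS code: `moving_average`, `linear_neuron`, `gain`, and `run_chain` are hypothetical names invented here, and the "neural" block is only a single linear neuron standing in for a real network.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A block maps an input sample list to an output sample list (assumed interface).
Block = Callable[[List[float]], List[float]]


def moving_average(window: int) -> Block:
    """Conventional block: causal moving average over `window` samples."""
    def run(signal: List[float]) -> List[float]:
        out = []
        for i in range(len(signal)):
            chunk = signal[max(0, i - window + 1): i + 1]
            out.append(sum(chunk) / len(chunk))
        return out
    return run


def linear_neuron(weights: Sequence[float]) -> Block:
    """Stand-in neural block: one linear neuron fed by a tapped delay line."""
    def run(signal: List[float]) -> List[float]:
        out = []
        for i in range(len(signal)):
            taps = [w * signal[i - k] for k, w in enumerate(weights) if i - k >= 0]
            out.append(sum(taps))
        return out
    return run


def gain(factor: float) -> Block:
    """Shared downstream block, identical in both chains."""
    return lambda signal: [factor * x for x in signal]


def run_chain(blocks: Sequence[Block], signal: List[float],
              probes: Sequence[int] = ()) -> Tuple[List[float], Dict[int, List[float]]]:
    """Run blocks in order and record the output of every probed block index."""
    captured: Dict[int, List[float]] = {}
    for index, block in enumerate(blocks):
        signal = block(signal)
        if index in probes:
            captured[index] = list(signal)
    return signal, captured


# Swap only the first block; keep the rest of the chain and the probe unchanged.
source = [2.0, 4.0, 6.0, 8.0]
conventional, conv_probe = run_chain([moving_average(2), gain(2.0)], source, probes=(0,))
neural, neural_probe = run_chain([linear_neuron([0.5, 0.5]), gain(2.0)], source, probes=(0,))
print(conv_probe[0], neural_probe[0])
```

Because both chains share the same probe position and downstream blocks, any difference in the probed signals is attributable to the swapped module, which is the kind of comparison the text describes.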

1.3  REPORT  ORGANIZATION 

Section 2 presents the NNCSS. Specifically, the feasibility study which defined the
NNCSS is summarized in 2.1, and the NNCSS architecture is described in 2.2. Section 3
describes the Neural Network Communications Library (NNCL), which is the library of
neural network function blocks used within the NNCSS environment. Section 4 describes
all of the simulation experiments which were used to investigate the application of neural
networks in communications signal processing. Section 5 presents the high-level
conceptual design of a neural network transceiver which could be developed in a phase II
follow-on program. Section 6 presents the conclusions of the program and
recommendations for additional work to be done.

1.4  REFERENCE  DOCUMENTS 
1.4.1  Government  Documents 

The following documents define the contractual requirements for the Neural Network
Communications Signal Processing Program:

DOD-STD-2167A, Defense System Software Development, 29 February 1988.

DI-MISC-80711/T, Data Item Description (DID) for Scientific and Technical
Reports (Final).

F30602-92-R-0005, Request for Proposal, Neural Net Communications Signal
Processing, Rome Laboratory, Griffiss Air Force Base.

F30602-92-C-0051, Contract, Neural Network Communications Signal Processing
Program, Rome Laboratory, Griffiss Air Force Base.

1.4.2  Non-government  Documents 

The following documents are contractual data requirements of the Neural Network
Communications Signal Processing Program:

Feasibility Study, Technical Information Report for the Neural Network
Communications Signal Processing Program, CDRL A003, 31 March 1993.

Software Development Plan for the Neural Network Communications Signal
Processing Program, CDRL A005, 1 March 1993.

System/Segment Design Document for the Neural Network Communications
Simulation System, 6 June 1994.

Software Design Document for the Neural Network Communications Library,
CDRL A004, 6 June 1994.

Software Design Document for the Neural Network Object Manager, CDRL A004,
6 June 1994.

Software Test Description for the Neural Network Communications Simulation
System, CDRL A006, 6 June 1994.

Software Test Report for the Neural Network Communications Simulation System,
CDRL A007, 6 June 1994.

Software Users Manual for the Neural Network Communications Simulation
System, CDRL A008, 7 June 1994.


The following documents define the Signal Processing WorkSystem™ that comprises the
commercial off the shelf (COTS) computer software configuration items (CSCI) that are
part of the Neural Network Communications Simulation System:

SPW™  -  The  DSP  Framework™  User's  Guide  and  Tutorial,  Product  Number: 
SPW8010,  Document  Version:  3.0,  Comdisco  Systems,  Inc.,  September  1992. 

SPW™  -  The  DSP  Framework™  Macro  Command  Language  Reference,  Product 
Number:  SPW8011,  Document  Version:  3.0,  Comdisco  Systems,  Inc.,  September 
1992. 

SPW™ - The DSP Framework™ Designer/BDE™ User's Guide, Product
Number: SPW8012, Document Version: 3.0, Comdisco Systems, Inc., September
1992.

SPW™ - The DSP Framework™ Signal Calculator™ User's Guide, Product
Number: SPW8013, Document Version: 3.0, Comdisco Systems, Inc., September
1992.

SPW™ - The DSP Framework™ Signal Flow Simulation User's Guide, Product
Number: SPW8014, Document Version: 3.0, Comdisco Systems, Inc., September
1992.

SPW™ - The DSP Framework™ DSP & BOSS™ Communications Library
Reference, Product Number: SPW8015, Document Version: 3.0, Comdisco
Systems, Inc., September 1992.

SPW™ - The DSP Framework™ Tool Interface Language Reference, Product
Number: SPW8016, Document Version: 3.0, Comdisco Systems, Inc., September
1992.

SPW™ - The DSP Framework™ Standard C Code Generation System™, Product
Number: CGS8000-C, Document Version: 1.6, Comdisco Systems, Inc.,
September 1992.

SPW™ - The DSP Framework™ Interactive Simulation Library™ Reference,
Product Number: ISL8000, Document Version: 1.0, Comdisco Systems, Inc.,
September 1992.

2. NEURAL NETWORK COMMUNICATIONS SIMULATION SYSTEM


The development of the NNCSS provides a communications-oriented digital signal
processing (DSP) simulator which smoothly incorporates the capability to invoke neural
network paradigms into signal processing chains. This is intended to be a generic tool
which will greatly facilitate the design and simulation of communications signal processing
products that incorporate embedded neural network technology.

The  NNCSS  is  based  on  the  Signal  Processing  WorkSystem™  (SPW™),  a  commercial 
off  the  shelf  (COTS)  signal  processing  simulator,  offered  by  Comdisco  Systems,  Inc. 
SPW  provides  a  block  diagram  approach  to  constructing  signal  processing  configurations 
for  simulation. 

The users of NNCSS are expected to be communication analysts and designers with a
detailed understanding of communication signal processing and only an elementary
understanding of neural network techniques. The NNCSS will cast the neural network
paradigms into a function block form that will allow the analyst to design and simulate
neural network based communication functions in a manner similar to what is currently
done using conventional design approaches.

The primary mission of the NNCSS will be to support investigations in the design of
specific communication systems based in part or totally on neural network approaches. The
NNCSS will allow comparison of neural network and conventional communication system
designs.

2.1  FEASIBILITY  STUDY 

The first task in the NNCSP Program was a feasibility study which identified a feasible
approach for developing a simulation software package for neural network communications
signal processing. The feasibility study also selected a set of neural network paradigms
which could provide a comprehensive set of neural network capabilities. That part of the
feasibility study which dealt with the selection of neural network paradigms is summarized
in 2.1. The software architecture is summarized in 2.2.

2.1.1  Neural  Network  Applications  in  Communications 

A literature search was undertaken as a precursor to the initiation of NNCSS software
development. This search focused on determining what sorts of neural network paradigms
have been successfully applied to communications signal processing problems. The search
included the DIALOG database search system, and a search of the Defense Technical
Information Center (DTIC). Several papers also were found in IEEE periodicals, including
papers from IEEE Transactions on Communications.

A search was performed using the key words "Neural()Network" and
"Communication()System" on the DIALOG database, which contains articles from
published technical periodicals. This database produced 98 matches. Of the titles
observed, there were 17 which appeared to be very relevant. Abstracts for these 17 were
ordered, and the full text of 13 were acquired.

A search was also performed on the DTIC using the key words "neural nets" and
"communication and radio systems". DTIC has a specific guide which constrained the key
word choices. Nineteen documents were discovered by this search; six were of interest
and were acquired.

In summary, the literature search produced a total of 46 papers (included in the
Bibliography, Section 7.0) which appeared relevant to NNCSS. These 46 papers were
individually reviewed, with a short report being generated for each paper which in fact had
strong relevance to the expected applicability of the NNCSS. Of the original 46 papers, 19
were of direct interest. These 19 are identified in the following list.

List of Applicable Papers

1.  Aazhang, B., et al., "Neural Networks for Multiuser Detection in Code-Division
    Multiple-Access Communications"
2.  Anderson, J., et al., "Radar Signal Categorization Using a Neural Network"
3.  Andersson, G., et al., "Generation of Soft Information in a Frequency-
    Hopping HF Radio System Using Neural Networks"
4.  Chesmore, E.D., "Application of Pulse Processing Neural Networks in
    Communications and Signal Demodulation"
5.  Feiz, S., et al., "Adaptive ML Neural Network Based Receiver for QPSK
    Modulated Data Transmission Systems"
6.  Fontana, R., et al., "Communications Signal Recognition and Demodulation via
    Neural Networks"
7.  Hussain, M., et al., "Neural Network Application to Error Control Coding"
8.  Kohonen, T., et al., "Combining Linear Equalization and Self-Organizing
    Adaptation in Dynamic Discrete Signal Detection"
9.  Jeffries, C., "High Order Neural Models for Error Correcting Code"
10. Johnson, J., "Neural Network Algorithm Decoding and Sequence Predictor"
11. Kechriotis, G., et al., "Using Recurrent Neural Networks for Blind Equalization
    of Linear and Nonlinear Communication Channels"
12. Lee, T., et al., "Adaptive Vector Quantization Using a Self-Development Neural
    Network"
13. Naylor, J., "A Neural Network Algorithm for Enhancing Delta Modulation/LPC
    Tandem Connections"
14. Santamaria, M., et al., "Neural Net Filters: Integrated Coding and Signalling in
    Communication Systems"
15. Rao, S., et al., "A Neural Network Tunable Filter for Multi-Tone Detection"
16. Siu, S., et al., "Decision Feedback Equalization Using Neural Network
    Structures"
17. Specht, D., "Probabilistic Neural Networks"
18. Tasic, J., et al., "Theory and Application of the Neural Net-Based Adaptive Filter
    in Communication Systems"
19. de Veciana, G., et al., "Neural Net-Based Continuous Phase Modulation
    Receivers"

2.1.2  Neural  Network  Paradigm  Choices  for  NNCSS 

This  section  summarizes  the  evaluation  of  the  applicable  papers  and  the  selection  of  neural 
network  paradigms  for  NNCSS. 

Table  2-1  summarizes  the  19  papers  mentioned  above.  Note  that  the  uses  of  neural 
paradigms  comprised: 

1. use of backpropagation networks in nine applications,
2. use of associative recurrent networks in four applications, specifically Hopfield and
   Brain State in a Box (BSB),
3. use of Kohonen feature map structures in two applications,
4. use of ART once,
5. use of the Probabilistic Neural Network once, and
6. use of two rather unique networks in one application each, specifically Time
   Discriminant (TD) and SPAN.

The  types  of  communications  applications  to  which  these  networks  were  applied  can  be 
taxonomized  as 

1. channel equalization,
2. interference rejection,
3. signal detection,
4. multipath rejection,
5. baseband data recognition and/or compression,
6. error detection and correction,
7. signal source identification, and
8. digital sequence prediction.

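Table 2-1 Summary of Applications Relevant to NNCSS

APPLICATION           PARADIGM                        TRAINING PROCEDURE            PAPER
--------------------  ------------------------------  ----------------------------  -----
[illegible]           recurrent network               real time recurrent learning  11
[illegible]           [illegible]                     autoadaptive                  12
[illegible]           [illegible]                     autoadaptive                  13
adaptive filtering    multilayer perceptron           backpropagation               14
signal detection      [illegible]                     backpropagation               15
[illegible]           [illegible]                     backpropagation               16
multipath correction  three-layer complex perceptron  backpropagation               18
CPM signal detection  [illegible]                     backpropagation               19

[The rows for papers 1 through 10 and 17 are not legible in the source.]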


It should be noted that some of the neural applications simultaneously represented
applications in more than one area, especially the application papers dealing with signal
detection/channel equalization schemes jointly carried out in a single neural architecture.

Of these applications, the highly successful ones tended to be those which replaced
computationally intense processes of traditional signal processing. In particular, the
applications in signal detection/channel equalization and error correction, where Viterbi
processes (normally quite computationally complex) were replaced or augmented, tended to
provide large performance gains, and all of the forms of signal detection tended to be good
or adequate, although some of them were not as good as (computationally complex)
optimal Bayesian decision rules. Another strong point of several applications dealing with
channel impairment was the unique ability of the neural paradigm to adapt to varying
signal environments. All of these papers documented work in which the neural application
was found to work as well as conventional approaches, at least over some practical range
of operation.

The application types and network types found in the literature search elucidate the fact that
many neural network paradigms are useful in signal processing applications, and in fact,
different neural paradigms were successfully applied to the same application area, as
illustrated in Table 2-1. It would be desirable for the NNCSS to support a range of neural
paradigms which provide the capabilities, in terms of architectures, training techniques, and
mathematical estimation possibilities, that are found in the neural literature. At the same
time, it is best if the NNCSS can deliver that range of versatility with a small collection of
well-known and well-documented paradigms.

Following this approach, the paradigms to be included were determined by creating a
prioritized list of criteria, and finding a set of networks which fulfilled all of those criteria.
The neural paradigms were chosen by satisfying the highest priority criterion not yet met by
prior selections with the best unrepresented neural paradigm meeting that criterion. Of
course, in some cases, a single paradigm satisfies several criteria, and in most cases,
several paradigms satisfy individual criteria. In this way a relatively small collection of
neural paradigms provides a wide range of neural capabilities to the NNCSS user.
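
The selection procedure described above is a greedy covering pass over a prioritized criteria list. The Python sketch below is one way to express it; the function name `select_paradigms`, the criterion labels, the paradigm-to-criterion mapping, and the preference scores are all illustrative assumptions made here and are not the contents of Table 2-2.

```python
def select_paradigms(criteria, satisfies, preference):
    """Greedy selection sketch.

    criteria:   criterion names in descending priority.
    satisfies:  dict mapping paradigm -> set of criteria it meets.
    preference: dict mapping paradigm -> score; higher means "better" when
                several unrepresented paradigms meet the same criterion.
    """
    chosen = []
    covered = set()
    for criterion in criteria:
        if criterion in covered:
            continue  # already met by a prior selection
        candidates = [p for p in satisfies
                      if criterion in satisfies[p] and p not in chosen]
        if not candidates:
            continue  # no available paradigm meets this criterion
        best = max(candidates, key=lambda p: preference[p])
        chosen.append(best)
        covered |= satisfies[best]  # one paradigm may satisfy several criteria
    return chosen


# Hypothetical inputs, for illustration only.
criteria = ["popular", "supervised", "unsupervised", "recurrent", "associative"]
satisfies = {
    "backpropagation": {"popular", "supervised"},
    "kohonen": {"unsupervised"},
    "art": {"unsupervised"},
    "fully_recurrent": {"supervised", "recurrent"},
    "bsb": {"associative"},
}
preference = {"backpropagation": 5, "kohonen": 4, "fully_recurrent": 3, "bsb": 2, "art": 1}

selected = select_paradigms(criteria, satisfies, preference)
print(selected)
```

In this toy input the "supervised" criterion never triggers a selection of its own, because the paradigm chosen for the higher-priority "popular" criterion already covers it.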


Table 2-2 below provides a synopsis of the neural paradigm choice process. The rows of
the table are indexed by the criteria of choice, ordered in importance, with the topmost row
reflecting the criterion deemed of greatest importance. The columns of the table reflect the
neural paradigms chosen to satisfy these criteria, and the marked row-column intersections
show which criteria are satisfied by each paradigm.

Table  2-2  Choices  of  Neural  Paradigms  for  the  NNCSS 


The  criteria  of  table  2-2  are  elaborated  as  follows: 

1. the most popular architecture found in applications is the multilayer perceptron
structure trained using backpropagation;

2. the backpropagation learning algorithm is known to have important theoretical
properties and has performed well in many previously documented applications;

3. several network architectures can be trained using supervised training;

4. likewise, several autoadaptive networks, most notably the Kohonen Self-Organizing
Feature Map (SOFM), can be trained in unsupervised mode;

5. recurrent networks are valuable in modeling processes with indefinite or infinite
temporal memory, such as the infinite impulse response filter, and the recurrent
backpropagation network is useful in such applications;

6. networks with variable topology permit the addition of new nodes in order to
accommodate recognition problems where an unknown (or varying) number of pattern
exemplars are involved;

7. associative networks are good at restoring partial patterns, or patterns
corrupted with noise, and the Brain State in a Box is a versatile but fairly simple
architecture of this type;

8. the competitive learning law (e.g., in the SOFM) can create a network which provides
equiprobable exemplars relative to the probability density of the input space, and is thus a
valuable technique in analysis and estimation;

9. one pass training refers to network structures that can be trained with a single pass
through the data (or, in some cases, by computing the weights directly from the known
probability of the function to be modeled), providing for very rapidly trained networks;

10. the networks found in the applications literature of this study happen to coincide
with the choices made by the preceding nine criteria, providing a validity check that using
such criteria leads to a range of paradigms which will be found in practice.
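
To make criterion 8 concrete, the following is a minimal Python sketch of a winner-take-all competitive learning law on scalar inputs. It is a simplification made for illustration: the neighborhood function of a full SOFM is omitted, the units are initialized evenly across the data range, and the function name `train_competitive` and its parameters are hypothetical, not NNCSS interfaces.

```python
def train_competitive(samples, n_units, rate=0.1, epochs=50):
    """Move only the winning unit toward each sample: w += rate * (x - w)."""
    lo, hi = min(samples), max(samples)
    # Spread the initial exemplars evenly across the observed input range.
    weights = [lo + (j + 0.5) * (hi - lo) / n_units for j in range(n_units)]
    for _ in range(epochs):
        for x in samples:
            # The unit whose weight is closest to the input wins the competition.
            winner = min(range(n_units), key=lambda j: abs(x - weights[j]))
            weights[winner] += rate * (x - weights[winner])
    return weights


# Two well-separated clusters of scalar inputs (made-up data).
samples = [0.0, 0.2, 0.4, 9.6, 9.8, 10.0]
exemplars = train_competitive(samples, n_units=2)
print(exemplars)
```

Each unit ends up as an exemplar for the region of the input space it wins, which is the behavior the criterion relies on; with denser regions of the input attracting more units, the exemplars follow the input density.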

Thus, the networks shown in Table 2-2 were, at the conclusion of the feasibility study,
those recommended for initial inclusion in the NNCSS, and would provide the required
versatility to address a wide range of application areas.

During the preliminary design of the NNCSS it was realized that the synthesis of neural
networks within a flexible signal processing simulation system, such as SPW, provided
opportunities and capabilities that had not been envisioned in neural network simulation
tools previously. With the NNCSS, different neural network paradigms can be combined
into more complex neural network designs to create entirely different neural network
solutions. For example, the reset logic of Adaptive Resonance Theory (ART) can be used
to trigger state changes to a recurrent network that is mapping the features extracted by a
Kohonen network to the input of a multilayered backpropagation network that is
performing process control that is affecting the original input pattern. Therefore, the
development of NNCSS leads to the identification and implementation of neural network
paradigms that can be configured with other paradigms to form multilayered architectures.
Because of this natural progression, the Adaptive Resonance Theory I (ART1) and
Grossberg Outstar paradigms were added to the NNCSS.

ART1 is the original binary version of Adaptive Resonance Theory for processing binary
(0,1) input patterns. This network has proven itself as a valuable network in data
compression and pattern recognition. More importantly, in the NNCSS, the network
provides another view of pattern features that can be used to stabilize the learning in other
neural networks such as the recurrent and backpropagation networks. Likewise the
Grossberg Outstar paradigm provides a simple but general technique for retrieving target
patterns from self organizing networks such as the Kohonen or ART networks. For
example, Hecht-Nielsen's Counterpropagation network combines two outstar layers with a
Kohonen layer to retrieve target patterns in both directions (input to output and output to
input mappings). Also the Fully Recurrent Network, which typically is configured as a
single layer network, has been modified to output a backpropagating residual error that
allows the recurrent layer to feedback to other recurrent or backprop layers in a multilayered
architecture.
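
As a rough illustration of how ART1 handles binary patterns and adds categories on demand, the sketch below implements a simplified fast-learning ART1 clustering loop in Python. It is a textbook-style simplification under stated assumptions (fast learning, a choice function with parameter `beta`, no explicit network dynamics) and is not the NNCSS function block; `art1_cluster`, `vigilance`, and `beta` are names chosen here.

```python
def overlap(a, b):
    """Number of positions where both binary patterns are 1 (|a AND b|)."""
    return sum(x & y for x, y in zip(a, b))


def art1_cluster(patterns, vigilance=0.6, beta=0.5):
    """Simplified fast-learning ART1: returns (prototypes, assignments)."""
    prototypes = []
    assignments = []
    for pattern in patterns:
        norm = max(sum(pattern), 1)  # guard against an all-zero input
        # Search committed categories in order of the choice function.
        order = sorted(
            range(len(prototypes)),
            key=lambda j: -overlap(pattern, prototypes[j]) / (beta + sum(prototypes[j])),
        )
        chosen = None
        for j in order:
            # Vigilance test: fraction of the input matched by the prototype.
            if overlap(pattern, prototypes[j]) / norm >= vigilance:
                chosen = j
                break  # resonance: accept this category
            # Otherwise the category is reset and the search continues.
        if chosen is None:
            prototypes.append(tuple(pattern))  # commit a new category node
            chosen = len(prototypes) - 1
        else:
            # Fast learning: the prototype becomes the AND of itself and the input.
            prototypes[chosen] = tuple(x & y for x, y in zip(pattern, prototypes[chosen]))
        assignments.append(chosen)
    return prototypes, assignments


patterns = [(1, 1, 0, 0), (1, 1, 1, 0), (0, 0, 1, 1), (1, 1, 0, 0)]
loose_protos, loose = art1_cluster(patterns, vigilance=0.6)
strict_protos, strict = art1_cluster(patterns, vigilance=0.9)
print(loose, strict)
```

Raising the vigilance parameter makes the match test stricter, so the same inputs are split into more categories; the reset step in the search loop is the kind of signal the text describes using to trigger state changes elsewhere in a design.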

Before any software development effort had been expended on the Probabilistic Network
(PN), consideration was given to trading the PN for the Adaptive Resonance Theory III
(ART3) network. The primary advantage of the PN paradigm over other neural networks is
that the learning is comparatively fast, in that the network does not need more than one
pass through the training data to reach its full capability. However, the Kohonen Topological
Feature Map as currently implemented has proven to be a feature extractor that converges
quickly to a stable feature mapping which can then be updated on a real-time basis.
Furthermore, the PN paradigm operates on a finite database of all previous patterns and
does not have the same capabilities of abstraction as the Kohonen network.
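
The trade-off described above (one-pass training versus reliance on a stored database of all training patterns) can be seen in a minimal Python sketch of a probabilistic neural network used as a Parzen-window classifier. The Gaussian kernel, the smoothing parameter `sigma`, the function names, and the sample data are assumptions made for illustration only.

```python
import math


def pnn_train(samples):
    """One-pass 'training': store every (vector, label) pair, grouped by class."""
    store = {}
    for vector, label in samples:
        store.setdefault(label, []).append(vector)
    return store


def pnn_classify(store, x, sigma=0.5):
    """Pick the class whose stored patterns give the largest mean kernel response."""
    scores = {}
    for label, vectors in store.items():
        total = 0.0
        for v in vectors:
            dist2 = sum((a - b) ** 2 for a, b in zip(x, v))
            total += math.exp(-dist2 / (2.0 * sigma ** 2))
        scores[label] = total / len(vectors)
    return max(scores, key=scores.get)


# Made-up two-class data.
training = [([0.0, 0.0], "A"), ([0.0, 1.0], "A"), ([5.0, 5.0], "B"), ([6.0, 5.0], "B")]
store = pnn_train(training)
first = pnn_classify(store, [0.2, 0.4])
second = pnn_classify(store, [5.5, 5.1])
print(first, second)
```

Training is a single pass that only stores patterns, but classification cost and memory grow with the number of stored patterns, and nothing is abstracted from them.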

It was finally decided that the software effort would be better spent on ART3 than on the
Probabilistic Network because the ART3 network paradigm is strongly suggested for the
following reasons:

a. Adaptive Resonance Theory III is the latest update to the Adaptive Resonance
Theory and completes the trilogy of network paradigms developed by Carpenter and
Grossberg.

b. ART3 is a hierarchical version of ART2 allowing for multilayered ART networks,
which continues the current direction of development for NNCSS, and extends the
applicability of ART in communication designs.

c. The Air Force has provided the major research funding for the development of
Adaptive Resonance Theory, and it seems appropriate to include ART3 in an Air Force
developed neural network simulation.

The final list of neural network paradigms implemented in the NNCSS is:

1. Backpropagation,

2. Kohonen Feature Map and Outstar,

3. Fully Recurrent,

4. Adaptive Resonance Theory I, II, & III, and

5. Brain State in a Box.
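
As an illustration of the last paradigm in the list, and of the pattern-restoring behavior attributed to associative networks earlier in this section, the sketch below shows a minimal Brain State in a Box recall loop in Python. The Hebbian outer-product weights, the feedback gain `alpha`, and the function names are assumptions made here for illustration; this is not the NNCSS implementation.

```python
def hebbian_weights(patterns):
    """Outer-product (Hebbian) weights for bipolar (+1/-1) stored patterns."""
    n = len(patterns[0])
    return [[sum(p[i] * p[j] for p in patterns) / n for j in range(n)]
            for i in range(n)]


def bsb_recall(weights, state, alpha=0.5, steps=20):
    """Iterate x <- clip(x + alpha * W x) inside the box [-1, 1]^n."""
    n = len(state)
    x = list(state)
    for _ in range(steps):
        feedback = [sum(weights[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [max(-1.0, min(1.0, x[i] + alpha * feedback[i])) for i in range(n)]
    return x


stored = [1.0, -1.0, 1.0, -1.0]
weights = hebbian_weights([stored])
noisy = [0.5, -0.2, 0.3, -0.6]  # attenuated, noisy version of the stored pattern
recalled = bsb_recall(weights, noisy)
print(recalled)
```

The positive feedback amplifies the component of the state that lies along the stored pattern, and the clipping holds the state at a corner of the box once it arrives there.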

2.1.3  Candidate  Applications  for  NNCSS 


The NNCSS Study includes effort to create several prototype applications on the NNCSS
which integrate the use of neural paradigms into signal processing algorithms which are
embedded in communications systems. The effort focuses on demonstrating the
capabilities of the NNCSS, rather than on producing new research results in neural
network theory. Thus in selecting potential applications, there were several objectives:

1. the applications should provide tutorial support to NNCSS users;

2. the theoretical structures of the applications should not be too complex, with low
technical risk implementation;

3. the applications should represent more common types of communications signal
processing functions which have been implemented with neural networks;

4. the inputs required for the application should be easily obtained or easily simulated
using the SPW foundation of NNCSS.

In order to illustrate the potential range of applications in communications signal
processing, we reiterate the types of applications which were found in the literature, as
referenced in Table 2-1; the number of occurrences of these applications are shown in Table
2-3. For completeness, this table includes one category which would seem to be a potential
application area (phased array antenna control), but for which no papers were found.
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The  tabulation  suggests  that  the  most  common  applications  are  signal  detection,  channel 
equalization,  and  error  detection  and  correction.  However,  the  remaining  categories  also 
illustrate  that  useful  ad^dve  implementations  can  be  found  for  signal  processing  taslu  both 
at  the  digital  and  analog  signal  pressing  levels  in  communications  chains. 

Table  2-3  Neural  Network  Applications  in  Communications  Systems 


The results of the literature search, together with the objectives set forth at the beginning of
this section and a subjective judgment of the value of neural networks (as compared to
conventional technology) in different application areas, lead to the selection of

1. channel equalization,

2. interference rejection,

3. signal detection,

4. multipath rejection and combining,

5. baseband data recognition and/or compression,

6. error detection and correction, and

7. signal source identification

as the application areas targeted for further investigation using the NNCSS.
This further investigation was also focused on the advantages of neural architectures in
adaptive situations.

2.2 SYSTEM ARCHITECTURE

The NNCSS was developed by integrating SAIC's Neural Network Object Manager
(NNOM) and Industrial Strength Neural Networks (ISNN) Library into Comdisco's Signal
Processing WorkSystem™ (SPW™). SPW provides interfaces and tools for integrating
custom function blocks. The NNCSS development involves developing neural network
function blocks and supporting tools that interact with ISNN via the NNOM.

Figure 2-1 gives the overall architecture and illustrates the development activity. The
square-cornered rectangles indicate non-development items based upon already existing
codes. The rounded rectangles identify development activities required to integrate neural
network function blocks into SPW. The arrows indicate the dependencies among the
configuration items and the development activities required to integrate function blocks
into SPW.


Figure  2-1  NNCSS  System  Architecture 

Table 2-4 describes the NNCSS configuration items included in this system architecture,
including commercial off the shelf (COTS) and non-development items (NDI). The Neural
Network Communications Library (NNCL) Computer Software Configuration Item
(CSCI) is the product of the software development effort in this contract. The software
design was documented in the Software Design Document (SDD) for the NNCL CSCI.
The non-development items (NDIs) are documented in separate documentation and user
manuals prepared for each item outside of this contract.


Table 2-4 NNCSS Computer Software Configuration Items

CSCI     COTS  NDI  DESCRIPTION

SFM      X     X    SPW File Management - Manages the various files required by the
                    other modules and allows the user to enter the other CSCIs.

BDE      X     X    Block Design Editor - Provides a graphical environment for
                    designing function blocks and systems with results stored in the
                    BDE database, which can be accessed by other CSCIs.

SigCalc  X     X    Signal Calculator - Used to generate the input source files for
                    the simulator and review the results after a run.

SPB      X     X    Simulation Program Builder - Builds an interactive simulation
                    program from the BDE database and runs the simulation. Calls the
                    function block codes using the source inputs created by SigCalc
                    and outputs the results.

CGS      X     X    Code Generator System (OPTIONAL) - Allows the user to generate C
                    source code to implement a BDE design independent of SPW. The
                    custom coded expression blocks must be defined for each SPB
                    function block.

DSPCL    X     X    Digital Signal Processing Communications Library - Provides
                    function block symbols and implementations that can be used to
                    design conventional communications systems.

ISL      X     X    Interactive Simulation Library (OPTIONAL) - Provides interactive
                    graphical elements that can be included in a design to allow the
                    user to interact with the simulation during a run.

NNCL                Neural Network Communications Library - Provides custom function
                    blocks for embedding neural network functions within
                    conventional communication designs.

NNOM           X    Neural Network Object Manager - Provides a standard operating
                    environment for all neural network paradigms to support
                    creating, initializing, saving, loading and deleting neural
                    network objects within a simulation.

3.  NEURAL  NETWORK  COMMUNICATIONS  LIBRARY 

Section 3 summarizes the Neural Network Communications Library (NNCL). The
NNCSS in operation is simply SPW with the specific inclusion of the NNCL. The
information and instructions necessary for user interaction with SPW are presented in the
SPW references given in Section 1.4.2. The user of the NNCSS should first study and
refer to the relevant SPW manuals regarding the operation of SPW and the generic use of
optional function block libraries such as the NNCL. The Software User's Manual for the
NNCSS presents the information about the NNCL and its function blocks that should be
available to the SPW user who intends to create block diagrams with NNCL function
blocks and to run simulations that include NNCL function blocks. This section consists
of excerpts from the Software User's Manual.

3.1  NNCL  OVERVIEW 

The Neural Network Communications Library (NNCL) is a set of neural network function
blocks that can be used with BDE to design communication systems that include neural
network based signal processing functions. It implements selected neural network
paradigms and provides support functions for training and processing neural networks
within communication designs. It also provides special handling functions for data
preprocessing and propagation of vector activations. With the NNCL, the analyst can
construct various configurations of neural networks and conventional signal processing
networks to perform signal processing functions.

Artificial Neural Systems (ANS) represent a rather diverse set of approaches to solving
problems in pattern recognition, image analysis, associative memory, classification,
filtering, and prediction. The various approaches are referred to as paradigms. The
paradigms provide the general rules and procedures for constructing a neural network to
perform a specific function. The common elements of an ANS paradigm are the processing
elements, which are local centers of computation that each represent an artificial neuron.
The processing elements are connected to form a network, with each processing element
receiving input signals from other processing elements in the network and generating an
output signal which propagates to other processing elements in the network. Often the
processing elements are grouped into layers within a network, with the outputs of the
processing elements of one layer being distributed to the inputs of the processing elements
in the next layer. The connections among the processing elements are weighted such that
the output activation signal from a processing element has either an excitatory or
inhibitory effect on the other processing elements.

The weights represent the memory of the system, and are formed through a process called
training. During training, examples are presented to the network at input processing
elements that initiate the propagation of signals through the network. The weights are
adjusted to represent the mapping of the input pattern to an output pattern. The paradigm's
method for adjusting the weights is called its learning algorithm. For unsupervised
learning, the input patterns are organized internally to form categories. Such algorithms
are self-organizing since they form their own output patterns for each input pattern. For
supervised learning, the network is given the desired target for a given input pattern. The
network is trained to map input patterns to desired output patterns. Once trained, the
neural network is able to retrieve the desired output pattern for a given input pattern even
if the input pattern does not exactly match any of the original training patterns. The
important characteristic of the neural network is its ability to generalize from the
training examples.

Figure 3-1 Typical Non-linear Threshold Functions Used in ANS (panels: Hard Threshold,
Ramp Threshold, Sigmoid Threshold, Cosine Threshold)

Another important characteristic of neural networks is that the mapping from input patterns
to output patterns is non-linear. This is achieved by passing the output signal from the
processing element through a non-linear threshold function such as a ramp, hard threshold
or sigmoid function (see Figure 3-1). This prevents a multi-layered network of processing
elements (where the output from one set of processing elements propagates to the next layer
of processing elements) from being reduced to a linear matrix operation.
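
The point about non-linearity can be checked numerically. The following is a minimal sketch in Python (not part of the NNCL, which is implemented as SPW function blocks); the weight matrices and input vector are arbitrary illustrative values, and the cosine threshold of Figure 3-1 is omitted because its definition is not given in the text. It shows that two purely linear layers collapse into a single matrix, while inserting a sigmoid between them does not.

```python
import math


def hard_threshold(x):
    """Step function: 1 when the input is non-negative, otherwise 0."""
    return 1.0 if x >= 0.0 else 0.0


def ramp_threshold(x):
    """Linear between 0 and 1, clipped outside that range."""
    return min(1.0, max(0.0, x))


def sigmoid_threshold(x):
    """Smooth S-shaped curve bounded by 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def matmul(a, b):
    """Multiply matrix a by matrix b (both lists of rows)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


# Illustrative weights and input (assumed values, not from the report).
W1 = [[1.0, -2.0], [0.5, 1.5]]
W2 = [[2.0, 1.0], [-1.0, 3.0]]
x = [0.3, -0.7]

# Two linear layers are equivalent to one layer with weights W2 * W1.
two_linear = matvec(W2, matvec(W1, x))
one_linear = matvec(matmul(W2, W1), x)

# With a sigmoid between the layers the result no longer matches.
two_nonlinear = matvec(W2, [sigmoid_threshold(s) for s in matvec(W1, x)])
```

Comparing `two_linear` with `one_linear` shows the collapse; `two_nonlinear` differs from both, which is what allows additional layers to add representational power.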

3.1.1  NNCL  CSCI  Architecture 

The basic architectural unit of the NNCL is called a function block. Each function block is
identified as a Computer Software Unit (CSU) in the NNCL design. A function block
performs a function within a signal processing network. The NNCL provides custom
function blocks for incorporating neural network functions into communications system
designs. There are two kinds of function blocks in the NNCL:

a.  Custom  Coded  Function  Blocks  (CCFBs)  -  low  level  function  blocks  that  require 
code  to  implement  the  function. 

b. Custom User Function Blocks (CUFBs) - hierarchical function blocks that are
composed of a connected network of low level function blocks. These do not
require any custom code and can be configured and saved by the user.

The NNCL includes both low level CCFBs and higher level CUFBs that provide common
configurations of CCFBs to aid the user in incorporating neural network designs into a
system.


The  NNCL  function  blocks  are  logically  ordered  so  that  the  user  can  select  the  appropriate 
function  block  to  incorporate  into  a  communications  system  design.  The  logical  groupings 
and  sub-groupings  will  be  identified  as  Computer  Software  Components  (CSCs)  of  the 
NNCL  CSCI.  Table  3-1  describes  the  first  level  of  groupings  in  the  NNCL. 

Table 3-1 NNCL Computer Software Components and Groupings

CSC  Grouping   Description

01   nnom       Neural Network Object Manager
02   manage     Neural Network Management
03   registers  Register Function Blocks
04   probes     Neural Network Instruments and Probes
05   backprop   Backpropagation Function Blocks
06   kohonen    Kohonen Feature Map Function Blocks
07   recurrent  Fully Recurrent Network Function Blocks
08   art        Adaptive Resonance Theory Function Blocks
09   bsb        Brain State in a Box Function Blocks

3.1.2  System  States  And  Modes 


A function block has various representations in the NNCSS depending upon the module or
tool accessing the CSU. Each representation is referred to as a model and has an associated
file containing the information for the model. A model of a function block will be
interpreted as a mode of a CSU within the NNCSS system. The specification of all of the
models for a function block will provide the detailed description for each CSU in paragraph
3.3. The following paragraphs define each of the models and its required content.

3.1.2.1 Symbol Model.

The symbol model defines how a function block is represented in a system's block
diagram. It consists of a function block symbol (typically a block) with connectors
defining the input, output, and control signals to the block (see Figure 3-2). By
convention, the input connectors are to the left, and output connectors are to the right of the
function block symbol. The connectors from the bottom indicate Boolean control
parameters that can vary during the simulation.

The parameters shown in the function block identify the configuration properties of the
function block, which distinguish the specific instance of the block. All of the parameters
of a function can be shown in a separate detailed or parameter view that is linked to the
symbol. The user can double click the symbol to access these views depending upon
whether the block is a CCFB or a CUFB. The parameters can be edited to define a specific
instance of the function block.


Figure 3-2 Function Block Symbol Model (labels: Context, connector, Control)

3.1.2.2 Parameter Model.

The parameter model for a function block defines the state parameters that are different for
each instance of the function block. The parameters can be used to define the initial
configuration of the function block (e.g., number of layers and processing elements, learn
rate, etc.); state parameters that will vary during the simulation (e.g., accumulated RMS
error, number of passes, weights, etc.); miscellaneous parameters that define detailed
options or variants of the function block (e.g., momentum, filter gains, scale factors, etc.);
and hidden parameters which are not shown to the user but are used internal to the function
block (e.g., neural network object, network configuration file, etc.).

The parameter model is displayed to the user in a separate window or screen in which the
user can edit or enter parameter values. Figure 3-3 illustrates a parameter screen, which is
divided into sections. Each parameter has a name and a description that describes the entry
and units for the parameter. Each parameter has a default value which is used if the user
does not enter a new parameter value.

Function Block Parameters

MAIN PARAMETERS:
    Parameter    description    value
    Parameter    description    value

MISCELLANEOUS PARAMETERS:
    Parameter    description    value
    Parameter    description    value

HIDDEN PARAMETERS:

Figure 3-3 Function Block Parameter Model

For custom coded function blocks, the parameter model is displayed in a separate window
linked to the symbol. The user entries to parameters in this window are passed to the
function block during initialization to configure the specific instance of the function block.
Also, any changes to the state can be inserted back into the BDE database for later viewing
by the user using the BDE.

For custom user function blocks, the parameter model is displayed in a parameter screen
associated with the detail model. The user entries to parameters in this window are passed
on to the custom coded function blocks that make up the detailed design. This allows the
user to edit or configure the low level function blocks without entering each block's
parameter model. It can also be used to identify and propagate common parameter values
to all constituents.
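
The default-value and parameter-propagation behavior described above can be sketched as follows. This is an illustrative Python model only; the class names `Parameter`, `CodedBlock`, and `UserBlock`, and the parameter names `learn_rate` and `momentum` used in the example, are assumptions made for the sketch and do not correspond to SPW or NNCL data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Parameter:
    """One row of a parameter screen: name, description, default, user value."""
    name: str
    description: str
    default: object
    value: object = None  # None means the user did not enter a value

    def resolve(self):
        """Return the user entry, falling back to the default."""
        return self.default if self.value is None else self.value


@dataclass
class CodedBlock:
    """Stands in for a custom coded function block (CCFB)."""
    name: str
    params: dict = field(default_factory=dict)

    def initialize(self):
        """Collect the values that would be passed to the block at start-up."""
        return {key: p.resolve() for key, p in self.params.items()}


@dataclass
class UserBlock:
    """Stands in for a custom user function block (CUFB)."""
    name: str
    constituents: list

    def set_common(self, key, value):
        """Propagate one parameter value to every constituent that has it."""
        for block in self.constituents:
            if key in block.params:
                block.params[key].value = value


def make_layer(name):
    """Build a hypothetical layer block with two parameters."""
    return CodedBlock(name, {
        "learn_rate": Parameter("learn_rate", "gradient step size", 0.1),
        "momentum": Parameter("momentum", "fraction of previous weight change", 0.0),
    })


hidden = make_layer("hidden")
output = make_layer("output")
network = UserBlock("bp_network", [hidden, output])

defaults = hidden.initialize()          # no user entries yet
network.set_common("learn_rate", 0.05)  # one entry at the hierarchical level
configured = [block.initialize() for block in network.constituents]
```

Setting `learn_rate` once on the hierarchical block changes it in both constituents, while `momentum` keeps its default because no value was entered.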

3.1.2.3 Detailed Model.

The detailed model represents the implementation for custom user function blocks,
showing the detailed block diagram of the constituent function blocks associated with the
symbol model. The detailed model identifies the constituent function blocks, their internal
connections, and processing paths (see Figure 3-4). The detailed model is linked to the
symbol model and can be accessed by the user by double clicking the symbol in the system
block diagram. The user is able to convert any block diagram into a hierarchical function
block with an associated symbol model using BDE.

Figure 3-4 Function Block Detail Model

3.1.2.4 Block Model.

For custom coded function blocks, the block model represents the internal processing of
the block during simulation. The template for the function block source code is generated
from the symbol and parameter models. This establishes the relationship and interface to
these models in the function block implementation. The template is then edited to include
the custom code that will be called during the simulation. See paragraph 3.1.3 for a
discussion of the allocation of processing resources to a function block during simulation.

3.1.2.5 Expression Model.

The expression model is the same as the block model except that the expression model
describes the code to be generated by the Code Generation System (CGS). The code
generation can be customized to generate only the code necessary for a particular
configuration of the function block, based upon the instance's configuration parameters.
The generated code may also be optimized for a specific host platform or parallel
configuration.

3.1.3  Memory  And  Processing  Time  Allocation 

The memory and processing time for each function block instance are managed by the
signal flow simulator during runtime. The Simulation Program Builder will reduce a
hierarchical block diagram into a network list of low level function blocks, removing all
hierarchical (custom user) function blocks. The custom coded function blocks are called
during runtime to perform specific process tasks for an instance of the function block.
The simulation will manage the memory allocation for all parameters and external
signals to the function block. The function block will be given an opportunity to
allocate, access, and dispose of any internal parameters or objects for an instance. Figure
3-5 gives the SPB interface, which shows the support from the NNOM in the
implementation of a neural network function block.
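
The reduction of a hierarchical block diagram to a list of low level blocks can be pictured with a short recursive sketch. This is only an illustration in Python of the flattening idea, not the SPB algorithm; the dictionary layout and the block names are hypothetical, and signal connections between blocks are ignored for brevity.

```python
def flatten(block, prefix=""):
    """Return the low level blocks under `block` as path-qualified names.

    A block with a non-empty "children" list is treated as hierarchical
    (custom user) and is removed from the result; a block without children
    is treated as a low level (custom coded) block and is kept.
    """
    path = prefix + block["name"]
    children = block.get("children")
    if not children:
        return [path]
    flat = []
    for child in children:
        flat.extend(flatten(child, path + "/"))
    return flat


# Hypothetical design: one hierarchical block nested inside a receiver.
design = {
    "name": "receiver",
    "children": [
        {"name": "filter"},
        {
            "name": "bp_equalizer",
            "children": [
                {"name": "normalize"},
                {"name": "bp_forward"},
                {"name": "bp_backward"},
            ],
        },
        {"name": "slicer"},
    ],
}

netlist = flatten(design)
```

The resulting `netlist` contains only leaf blocks; the hierarchical names survive only as path prefixes, which keeps instances of the same low level block distinguishable.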


Figure 3-5 SPB/NNOM Interface for NNCL Function Block


3.2  THE  COMPUTER  SOFTWARE  COMPONENTS  OF  NNCL 

3.2.1 Neural Network Object Manager (CSC01)

CSC Purpose. This CSC provides the standard interface between the SPW function
blocks and the Industrial Strength Neural Network (ISNN) Library. Each neural
network function block must have a representation in the BDE database and a
corresponding neural network object in the NNOM database. Figure 3-6 illustrates
the interfaces for the construction of neural networks within the NNCSS.

Execution and control data flow. The neural network object management routines are
implemented as standard C functions that are called from within custom SPW function
block implementations. The primary object managed by the NNOM routines is a neural
network object (NNO) that can encapsulate other neural network objects as well as
internal parameters and registers. The NNOs are created and set up by the NNOM from
parameters defined in the BDE prior to a simulation run. The NNOM saves the NNOs
to binary and text files for loading in subsequent simulations. The NNOM routines also
translate the ISNN error code status and report it to SPW error reporting and processing.
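
The create, save, load, and delete life cycle described above can be sketched in a few lines. The report states that the actual NNOM routines are C functions; the Python class below is only a hypothetical stand-in whose method names, file formats (`pickle` for binary, JSON for text), and object layout are assumptions made for illustration.

```python
import json
import os
import pickle
import tempfile


class ObjectManager:
    """Hypothetical stand-in for an object manager; not the NNOM C interface."""

    def __init__(self):
        self._objects = {}

    def create(self, name, **params):
        """Create and register a new network object from block parameters."""
        if name in self._objects:
            raise KeyError(f"object {name!r} already exists")
        self._objects[name] = {"params": dict(params), "weights": []}
        return self._objects[name]

    def get(self, name):
        return self._objects[name]

    def delete(self, name):
        del self._objects[name]

    def names(self):
        return sorted(self._objects)

    def save_binary(self, path):
        """Write a binary checkpoint of all managed objects."""
        with open(path, "wb") as handle:
            pickle.dump(self._objects, handle)

    def save_text(self, path):
        """Write a human-readable configuration file."""
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(self._objects, handle, indent=2, sort_keys=True)

    @classmethod
    def load_text(cls, path):
        """Rebuild a manager from a previously saved text file."""
        manager = cls()
        with open(path, "r", encoding="utf-8") as handle:
            manager._objects = json.load(handle)
        return manager


with tempfile.TemporaryDirectory() as workdir:
    manager = ObjectManager()
    net = manager.create("equalizer", layers=[4, 3, 1], learn_rate=0.1)
    net["weights"] = [0.25, -0.5]
    text_path = os.path.join(workdir, "system.json")
    manager.save_binary(os.path.join(workdir, "checkpoint.bin"))
    manager.save_text(text_path)
    restored = ObjectManager.load_text(text_path)

manager.delete("equalizer")
```

Keeping both a binary and a text form mirrors the distinction in the text between files written for reloading and files that a user may want to inspect.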

Figure 3-6 Neural Network Builder Interfaces

3.2.2  Neural  Network  Management  (CSC02) 

CSC Purpose. This CSC provides general function blocks to manage the
processing of neural networks during a simulation. This includes custom coded
function blocks for cycling the system through multiple passes of the input signal files
for embedded neural network training, and general custom user function blocks to
define typical configurations of neural network and preprocessing function blocks.

Execution and control data flow. Figure 3-7 gives the general function block
architecture for management of neural network objects within the NNCSS. The NNOM
function block creates an object manager to contain the neural network objects. At the
top level the NNOM manages all neural network objects, including instantiation,
checkpointing, writing, and disposal of the system of neural network objects. At
system initialization, it either loads a previously saved neural network configuration or
creates an nnom to manage the neural network objects created within the simulation. The
resulting nnom is output to all neural network objects.

The neural network object function block is typically a specific neural network
paradigm process. The neural network object function block either retrieves a
previously saved neural network object from the nnom or creates a new neural network
object to be managed by the nnom. The function block uses the BDE defined
parameters to set up the function block during system initialization. After all the function
blocks have set up their neural network objects, the NNOM function block allocates and
instantiates all of the objects at the start of processing. During the simulation, the
NNOM periodically saves binary versions of the neural network objects during
checkpointing. At the termination of the simulation, the top level NNOM writes the
system to a text configuration file.

Each neural network object has control inputs that define the processing state of the
object. These include: hold, which suspends processing whenever hold is greater than 0.0;
learn, which activates the paradigm learning algorithm when greater than 0; and relax,
which performs the learn epoch update processing during the pass.


Figure  3-7  -  Neural  Network  Object  Management  Component  Architecture 


The Neural Network Object Controller (NNOC) function block coordinates these
control signals during a simulation. It generates the appropriate learn and relax control
signals to implement multiple training epochs during a simulation. It supports feedback
of a delta error signal, such that the rms or max output error from the innov register can
be used to threshold the learning during the backward pass through the neural network
object. The NNOC is used primarily for controlling supervised neural network objects.
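
The interplay of the hold, learn, and relax inputs with an error-thresholded controller can be sketched as below. This is a simplified Python illustration, not the NNOC implementation; the rule that learn is asserted only while the fed-back error exceeds a threshold, the rule that relax is asserted on the last step of each epoch, and all names and numeric values are assumptions made for the sketch.

```python
def control_signals(step, steps_per_epoch, error, error_threshold):
    """Return (learn, relax) control values for one simulation step.

    learn is asserted while the fed-back error exceeds the threshold;
    relax is asserted on the last step of each epoch (assumed convention).
    """
    learn = 1.0 if error > error_threshold else 0.0
    relax = 1.0 if (step + 1) % steps_per_epoch == 0 else 0.0
    return learn, relax


class ToyNetworkObject:
    """Counts what a network object would do in response to its control pins."""

    def __init__(self):
        self.processed = 0
        self.learned = 0
        self.epoch_updates = 0

    def step(self, hold, learn, relax):
        if hold > 0.0:          # hold suspends all processing
            return
        self.processed += 1
        if learn > 0.0:         # learn activates the learning algorithm
            self.learned += 1
        if relax > 0.0:         # relax triggers the epoch update
            self.epoch_updates += 1


# Hypothetical per-step output errors fed back to the controller.
errors = [0.5, 0.4, 0.05, 0.3, 0.02, 0.01]
obj = ToyNetworkObject()
for step, error in enumerate(errors):
    learn, relax = control_signals(step, steps_per_epoch=3,
                                   error=error, error_threshold=0.1)
    obj.step(hold=0.0, learn=learn, relax=relax)

# A held step leaves every counter unchanged.
obj.step(hold=1.0, learn=1.0, relax=1.0)
```

With these values the object processes six steps, learns on the three steps whose error exceeds 0.1, and performs two epoch updates.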

For unsupervised learning, the Neural Network Object Control Clock
(NNOC_CLOCK) (not shown) is used to generate an inner and outer loop counter
signal that allows various neural network objects to be processed at a higher rate than
the other function blocks. This allows the internal cycles of the neural network object to
be observed. The outputs from the NNOC_CLOCK are control signals that are
connected to the function block's hold pin. The outer loop (pass) control signal is held;
the function blocks on the inner loop (cycle) are activated for a simulation cycle. At
each epoch, the pass control signal is held down so that the other function blocks can
execute.


3.2.3  Register  Function  Blocks  (CSC03) 


CSC Purpose. This CSC extends the standard library of vector function blocks to
provide activation function processing commonly required for neural network
processing. This includes normalization, shift, delay, window, merge, split, and error
registers used in neural network designs.


Execution and control data flow. The activation function blocks are special vector
processing functions, which are used to prepare the activation data for a neural network
paradigm. These are based upon existing vector operators in the DSP Communications
Library. Additional vector operations are required to connect neural network
function blocks within a communications system. These include:


a. Shift Register. A vector input is shifted onto the top of an array such that the last t
inputs are represented within the register. The full register is output at each time
step, providing a fixed window sample of the time sequence of input vectors.

b. Delay Register. The same as a shift register except that the bottom vector sample in
the register is output with each time step, effectively delaying the output by the
number of shifts in the register.

c. Merge Register. Combines vector inputs of different sizes into a composite vector
at the output.

d. Split Register. Splits a larger vector into two smaller vectors.

e. Offset Register. Accesses a vector at a constant offset from a larger vector.

f. Normalization and Denormalization Registers. Perform the required normalization
and denormalization in and out of a neural network function block.
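
The register operations listed above can be sketched compactly. The following Python sketch is illustrative only and is not the NNCL implementation; the class and function names are hypothetical, registers are assumed to start filled with zeros, the delay register is assumed to read its bottom row before shifting (so the delay equals the register depth), and normalization is assumed to be a linear mapping of a known range onto 0 to 1.

```python
from collections import deque


class ShiftRegister:
    """Keeps the last `depth` input vectors and outputs them all each step."""

    def __init__(self, depth, width):
        self._rows = deque([[0.0] * width for _ in range(depth)], maxlen=depth)

    def step(self, vector):
        # Newest vector goes on top; the oldest falls off the bottom.
        self._rows.appendleft(list(vector))
        return [x for row in self._rows for x in row]


class DelayRegister:
    """Outputs the bottom vector each step, delaying the input by `depth` steps."""

    def __init__(self, depth, width):
        self._rows = deque([[0.0] * width for _ in range(depth)], maxlen=depth)

    def step(self, vector):
        oldest = self._rows[-1]
        self._rows.appendleft(list(vector))
        return oldest


def merge(*vectors):
    """Combine vectors of different sizes into one composite vector."""
    return [x for v in vectors for x in v]


def split(vector, index):
    """Split a larger vector into two smaller vectors at `index`."""
    return vector[:index], vector[index:]


def offset(vector, start, length):
    """Access `length` elements at a constant offset within a larger vector."""
    return vector[start:start + length]


def normalize(vector, lo, hi):
    """Map values from the range [lo, hi] onto [0, 1]."""
    return [(x - lo) / (hi - lo) for x in vector]


def denormalize(vector, lo, hi):
    """Map values from [0, 1] back onto the range [lo, hi]."""
    return [lo + x * (hi - lo) for x in vector]


shift = ShiftRegister(depth=2, width=1)
window_outputs = [shift.step([v]) for v in (1.0, 2.0, 3.0)]

delay = DelayRegister(depth=2, width=1)
delayed_outputs = [delay.step([v]) for v in (1.0, 2.0, 3.0)]
```

The shift register turns a scalar time series into a sliding window that a feed-forward network can consume, while the delay register realigns a target signal with the window.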

3.2.4 Neural Network Instruments and Probes (CSC04)

CSC Purpose. This CSC extends the ISL to provide custom user function blocks
that are particularly useful for the display and control of neural network simulations.
These blocks provide instruments and probes to assist in the training and validation of
neural network based simulations.

Execution and control data flow. These function blocks shall be implemented using
ISL low level function blocks and will follow the execution and control data flow
defined for these latter functions. Function blocks which will be controlled by these
instruments will provide the appropriate control pins for attaching the instruments within
a system design. General data extraction and data set function blocks shall be used to
extract or control internal parameters and vectors within a neural network object.

3.2.5  Backpropagation  Function  Blocks  (CSC05) 

CSC Purpose. This CSC provides custom function blocks for implementing the
Backpropagation Neural Network within a communications design. The CSC includes
custom coded function blocks that implement forward and backward processing
algorithms, and multilayered feed forward neural networks. It also provides custom
user function blocks which allow backpropagation to be incorporated in various
configurations and to perform various functions within a communications design.

Execution and control data flow. The execution and control data flow is defined by the
propagation architecture of the Backpropagation Neural Network. The following
defines the overall architecture to be implemented by this CSC.

The Backpropagation Neural Network is one of the most important and widely applied
ANS paradigms. The backpropagation network is trained by supervised learning.
Figure 3-8 illustrates the network architecture. The network is formed of one or more
layers of processing elements. The first layer represents the input pattern, and the outputs
of these elements are connected to the input of each of the processing elements in the next
layer, and so on through the network. As the signals feed forward through the network,
each processing element takes the weighted sum of the activation input signals and
passes this sum through a sigmoid threshold function to generate one output signal to
each of the elements in the next layer. The output from the last layer in the network
(called the output layer) represents the learned or retrieved pattern.

Figure 3-8 Backpropagation Network Architecture (layers shown: Hidden Processing
Layer, Output Processing Layer)
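
The feed-forward step described above, a weighted sum passed through a sigmoid in every processing element, can be sketched as follows. This is a minimal Python illustration rather than the NNCL code; the function names and the layer sizes in the example are assumptions, and no bias terms are included because the text does not mention them.

```python
import math


def sigmoid(x):
    """Sigmoid threshold function bounded by 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(weights, inputs):
    """Each row of `weights` belongs to one processing element.

    The element forms the weighted sum of its input activations and
    passes the sum through the sigmoid to produce one output signal.
    """
    return [sigmoid(sum(w * a for w, a in zip(row, inputs))) for row in weights]


def network_forward(layers, pattern):
    """Propagate an input pattern through every layer in turn."""
    activation = list(pattern)
    for weights in layers:
        activation = layer_forward(weights, activation)
    return activation


# Hypothetical 3-input network with a 2-element hidden layer and 1 output.
hidden_weights = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.6]]
output_weights = [[1.5, -2.0]]
retrieved = network_forward([hidden_weights, output_weights], [1.0, 0.0, 1.0])

# With all-zero weights every weighted sum is 0, so every output is 0.5.
flat = network_forward([[[0.0, 0.0]], [[0.0]]], [3.0, -1.0])
```

Each output lies strictly between 0 and 1 because of the sigmoid, so target patterns for such a network are normally scaled into that range.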

Initially the connection weights are set to random values. During training, example
input patterns are propagated through the network. The output pattern is compared to
the desired target pattern. The difference between the output and target patterns is the
error, which is back-propagated through the same connections. During backpropagation
each processing element adjusts its connection weights slightly and propagates a delta
error to the preceding layer of processing elements. The next time the input examples
are propagated through the network, the output pattern is generally closer to the target
pattern. The error is again back-propagated and the training cycle repeated until the total
RMS error over the training set converges to an arbitrarily small value. The
backpropagation network implements gradient descent learning, which means that the
weights are adjusted such that the total RMS error decreases toward a minimum, ideally
zero.

Figure 3-9 gives the learning algorithm for the network. It illustrates the algorithm for a
processing element as two processing paths: the path of the forward activation signal
and the path of the backward propagated delta error signal. The two processing paths
share the same connection weights between weight changes. The input, output and
target patterns of the forward signal path are used to compute the changes to the
weights during the backward processing. This algorithm is performed on all of the
processing elements (j) within a layer.

Figure 3-9 Backpropagation Learn Algorithm for a Layer (paths shown: Forward
Activation Signal, Backward Delta Error Signal)

The delta error backpropagated from a single processing element is a vector that
corresponds one-to-one with the input activation vector. In turn, each processing
element in a hidden layer receives a delta error component from each processing
element in the next layer to which it is connected. During backpropagation every
weight in the entire network receives a specific delta error which is used in the equation
to change that weight.

For optimal gradient descent learning, the delta weight changes are accumulated for a
designated number of cycles, and then applied to change the connection weights at the
end of each update interval. Training the neural network requires multiple passes
through the training data, that is, multiple training epochs through the input stream.
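
The accumulate-then-apply update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the NNCSP function block code: the two-layer layout, the sigmoid units, and the names `init_weights`, `forward`, `train`, `rate` and `update_interval` are all hypothetical choices made for the example.

```python
import math
import random


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def init_weights(n_in, n_hidden, seed=1):
    """Random initial weights; the last weight of each row is a bias."""
    rng = random.Random(seed)
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w_hid, w_out


def forward(x, w_hid, w_out):
    """Forward activation path: input -> hidden layer -> one output."""
    xb = list(x) + [1.0]
    hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hid] + [1.0]
    y = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
    return xb, hb, y


def rms_error(data, w_hid, w_out):
    sq = [(t - forward(x, w_hid, w_out)[2]) ** 2 for x, t in data]
    return math.sqrt(sum(sq) / len(sq))


def train(data, w_hid, w_out, rate=0.5, update_interval=4, epochs=500):
    """Accumulate delta weights and apply them once per update interval."""
    acc_hid = [[0.0] * len(row) for row in w_hid]
    acc_out = [0.0] * len(w_out)
    seen = 0
    for _ in range(epochs):
        for x, t in data:
            xb, hb, y = forward(x, w_hid, w_out)
            d_out = (t - y) * y * (1.0 - y)          # output delta error
            for j, row in enumerate(acc_hid):
                # delta error propagated back through the same weight w_out[j]
                d_hid = d_out * w_out[j] * hb[j] * (1.0 - hb[j])
                for i in range(len(row)):
                    row[i] += rate * d_hid * xb[i]
            for j in range(len(acc_out)):
                acc_out[j] += rate * d_out * hb[j]
            seen += 1
            if seen % update_interval == 0:          # end of update interval
                for row, acc in zip(w_hid, acc_hid):
                    for i in range(len(row)):
                        row[i] += acc[i]
                        acc[i] = 0.0
                for j in range(len(w_out)):
                    w_out[j] += acc_out[j]
                    acc_out[j] = 0.0
```

Setting `update_interval` to the size of the training set gives one weight change per epoch; setting it to 1 applies each delta weight immediately.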

3.2.6  Kohonen  Feature  Map  Function  Blocks  (CSC06) 

CSC Purpose. This CSC provides custom function blocks for implementing and
training Kohonen Topological Feature Map neural networks. The CSC includes custom
coded function blocks for unsupervised learning and recall processing algorithms. It
also provides custom user function blocks that allow Kohonen networks to perform
various functions within a communications design.

Execution and control data flow. The execution and control data flow is defined by the
propagation architecture of the Kohonen Topological Feature Map. The following
defines the overall architecture to be implemented by this CSC.

The Kohonen Topological Feature Map provides a method for creating networks that
can be trained to classify input vectors while preserving the inherent topology of the
training set. Topology preserving maps mean that the nearest neighbor relationships
in the training set are preserved in the network, such that input vectors presented to the
network that have not been previously "learned" will be categorized by their nearest
neighbor in the network's learned exemplar set.

Such a network becomes a parameterized version of a Bayesian classifier, except that
the probability distribution does not have to be known a priori. The only requirement is
that the training set has to be a representative sample of the distribution of input
patterns. Once a network has been trained, it will provide near optimal performance in
classifying input patterns with minimal processing.

Its ability to map unknown distributions can extend the use of the network to areas
where Bayesian approaches are intractable, such as voice recognition and vision
processing. It can be used for feature extraction within a more complex network such
as counter propagation, or as a filter in a signal processing network.

Figure 3-10 Kohonen Topological Feature Map Architecture

Figure 3-10 gives the network architecture. The network consists of an input vector of
dimension N that is fully connected to a set of M processing elements, where M is the
number of exemplar vectors for the mapping. The M processing elements are arranged
to form a two dimensional array at the output. The dimensions of the array can be any
size, but are typically arranged to form a square array with sqrt(M) elements per side. The
output from the network is a vector of M elements given by:

\[ y_j = \lVert x - w_j \rVert, \qquad j = 1, \ldots, M \tag{3.2.6-1} \]

where the winning node y_j is the minimum element value:

\[ y_j = \min_{k = 1, \ldots, M} y_k \tag{3.2.6-2} \]

giving the nearest exemplar vector (w_j) to the input vector (x). The network can also
work when maximum activations are used during training. In that case, the exemplar
vectors are in the hyperplane normal to the input vector.
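
The recall step described above (compute one output value per exemplar, then take the minimum) can be sketched as follows. The use of Euclidean distance and the function names `feature_map_output` and `winning_node` are assumptions made for illustration.

```python
import math


def feature_map_output(x, exemplars):
    """One output value per node: distance from input x to that exemplar."""
    return [math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))
            for w in exemplars]


def winning_node(y):
    """Index of the minimum element of the output vector."""
    return min(range(len(y)), key=lambda j: y[j])
```

For a square output array with `side` elements per side, the winner's row and column are `divmod(winner, side)`.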

During training the exemplar vectors are arranged into a two dimensional topological
map where the output values will increase proportionally to the distance from the
minimum. This two dimensional configuration gives the maximum degree of freedom
for representing nearest neighbor relationships that are not strictly ordered. A strictly
ordered relationship requires that the distance among the exemplar categories be
transitive (A < B < C implies A < C and that B is between A and C). With square
arrays, weakly ordered and quasi-ordered relations can be accommodated. The ordering
can have a geometric interpretation for simple two-dimensional vectors. For example, if
the input vector has two dimensions whose element values are distributed uniformly
over the interval [0.0, 1.0], then the set of exemplars will span the space uniformly
such that the output values increase proportionally to the distance from the minimum
element in the output array. For higher dimensional input vector spaces, such a
geometric interpretation is not possible. The resulting output array will be a minimum
spanning tree where regions of the array will become excited by the input vector
according to their distance relative to this relational metric.


Figure 3-11 Kohonen Neighborhood Diagram

To achieve a parameterized topological map, the network is presented with a training
set, which is a random sample of the values that the elements of the input vector can
have. Initially, the exemplar weights are randomized unit vectors. When an exemplar
vector is found nearest to the input training vector, the other vector weights are updated
in proportion to their distance from the minimum in the output array for a neighborhood
about the minimum. The dynamics of the weight equation are given by:

\[ \Delta w_{ij} = \alpha \, (x_i - w_{ij}) \tag{3.2.6-3} \]

where j is in a neighborhood of the winning node in y, and α is the learn rate for the weight
connections. The neighborhood is defined by a neighborhood depth giving the number
of nodes from the minimum to be updated. This is illustrated in Figure 3-11. At first,
the entire array of exemplar vectors is adjusted about the minimum vector for each
input. After a number of iterations, the region about the minimum is gradually reduced
until only the nearest neighbors are updated. Subsequent iterations refine the exemplar
vectors by reducing the learning rate according to:


\[ \alpha = \alpha_p \left( 1 - \frac{t_p}{T_p} \right) \tag{3.2.6-4} \]


where p designates the learning phase, T_p is the maximum number of iterations within
the current learning phase, t_p is the number of iterations since the beginning of the
learning phase, and α_p is the initial learn rate for the phase.

At the end of a learning phase, t_p = T_p, the maximum time is increased by a factor of
5, T_p = 5 T_{p-1}, and the learn rate is decreased by a factor of 5, α_p = α_{p-1} / 5.

Likewise, during the initial phase, p = 0, the neighborhood depth, ω, is decreased as a
function of the iterations by:

\[ \omega = \omega_0 \left( 1 - \frac{t_0}{T_0} \right) \tag{3.2.6-5} \]

where ω_0 is the initial neighborhood depth, set to one-half of the output array side
dimension. In subsequent phases, ω = 1, indicating only the nearest neighbors are
updated.
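
The training schedule described above can be sketched as follows. This is a sketch under stated assumptions: a linear decay of both the learn rate and the neighborhood depth within a phase, a square neighborhood measured in rows and columns of the output array, and hypothetical function names throughout.

```python
def learn_rate(alpha_p, t_p, T_p):
    """Learn rate within phase p, decaying linearly (assumed) to zero."""
    return alpha_p * (1.0 - t_p / T_p)


def next_phase(alpha_p, T_p):
    """At the end of a phase: initial rate divided by 5, duration times 5."""
    return alpha_p / 5.0, T_p * 5


def neighborhood_depth(omega_0, t_0, T_0, phase):
    """Depth shrinks during phase 0 (assumed linear); afterwards it is 1."""
    if phase > 0:
        return 1
    return max(1, int(omega_0 * (1.0 - t_0 / T_0)))


def update_neighborhood(weights, x, winner, depth, alpha, side):
    """Move each exemplar within `depth` rows/columns of the winner toward x."""
    win_row, win_col = divmod(winner, side)
    for j, w in enumerate(weights):
        row, col = divmod(j, side)
        if max(abs(row - win_row), abs(col - win_col)) <= depth:
            for i in range(len(w)):
                w[i] += alpha * (x[i] - w[i])
```

A training loop would call `learn_rate` and `neighborhood_depth` once per input, apply `update_neighborhood` around the winning node, and call `next_phase` whenever t_p reaches T_p.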

Typically, 100,000 iterations are required to achieve an optimal mapping function;
however, the basic topology is achieved during the initial training phase. The resulting
parameterized map is a set of exemplar vectors of near-uniform length that will
model the input probability distribution. The exemplars will be arranged such that each
exemplar is equally likely to capture an input vector. Regions of high probability will
have more exemplar vectors than regions of low probability.

3.2.7  Fully  Recurrent  Network  Function  Blocks  (CSC07) 

CSC Purpose. This CSC provides custom function blocks for implementing and
training fully recurrent back propagation neural networks within an embedded system
design. The CSC includes custom coded function blocks that implement forward recall
and error feedback learning algorithms. It also provides custom user function blocks
which allow recurrent networks to be incorporated in various configurations and to
perform various functions within a communications design.

Execution and control data flow. The execution and control data flow is defined by the
propagation architecture of the Fully Recurrent Network. The following defines the
overall architecture to be implemented by this CSC.

The Fully Recurrent Network is similar to the Backpropagation Network except that the
neural network learns to map sequences at the input into sequences at the output. A
fully recurrent network consists of a set of processing elements which are fully
connected with every other processing element in the network and with the input
activation vector. Figure 3-12 illustrates the fully recurrent network architecture as
developed by Williams and Zipser.

Figure 3-12 Fully Recurrent Network

The input to the network, z_k(t), consists of the input activation vector, x_k(t), combined
with the output activation vector of the previous cycle from all of the processing
elements, y_k(t):


\[ z_k(t) = \begin{cases} x_k(t) & \text{if } k \in I \\ y_k(t) & \text{if } k \in U \end{cases} \tag{3.2.7-1} \]

where k = 1 ... M + N, M is the number of external inputs and N is the number
of processing units. The sets I and U designate the input and output indices
respectively.


The output for the kth element at the next time step, y_k(t+1), is given by the sum of the
weighted connections for the kth processing element, passed through the unit's
squashing function:

\[ s_k(t) = \sum_{l \in U} w_{kl} \, y_l(t) + \sum_{l \in I} w_{kl} \, x_l(t) = \sum_{l \in U \cup I} w_{kl} \, z_l(t) \tag{3.2.7-2} \]

\[ y_k(t+1) = f_k(s_k(t)) \tag{3.2.7-3} \]

The recurrent application of the output to the input allows the network to form its own
internal organization for learning time sequences at the input (i.e. recurrence of patterns
at the input), and it also allows the network to form its own internal layers over several
iterations of the input pattern.
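
One forward (recall) cycle of the network described above can be sketched as follows. The logistic squashing function and the column ordering of the weight matrix (external inputs first, then unit outputs) are assumptions made for the example, as is the name `recurrent_step`.

```python
import math


def recurrent_step(w, x, y):
    """Compute y(t+1) from the external input x(t) and unit outputs y(t).

    w[k][l] is the weight into unit k from element l of z(t), where z(t)
    is the external input vector followed by the unit output vector.
    """
    z = list(x) + list(y)
    s = [sum(w_kl * z_l for w_kl, z_l in zip(row, z)) for row in w]
    return [1.0 / (1.0 + math.exp(-s_k)) for s_k in s]
```

Calling `recurrent_step` repeatedly, feeding each result back in as `y`, reproduces the recycling of the output to the input over successive iterations.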

The output from the network consists of a subset of the processing element outputs
(yreg) which are trained from target vectors (dreg). The processing elements which are
not directly trained are used internally to the network as hidden activations. Let T(t)
designate the set of active indices for the output at time t for which there is a
corresponding target element, d_k(t). Then the delta error, e_k(t), is:

\[ e_k(t) = \begin{cases} d_k(t) - y_k(t) & \text{if } k \in T(t) \\ 0 & \text{otherwise} \end{cases} \tag{3.2.7-4} \]

The delta error is back propagated to all of the units and the weights updated using:

\[ \Delta w_{ij}(t) = \alpha \sum_{k \in T(t)} e_k(t) \, p^k_{ij}(t) \tag{3.2.7-5} \]

where α is the learning rate and p^k_ij(t) is the kth unit's contribution to the weight change
given by:

\[ p^k_{ij}(t+1) = \frac{\partial y_k(t+1)}{\partial w_{ij}} = f_k'(s_k(t)) \left[ \sum_{l \in U} w_{kl} \, p^l_{ij}(t) + \delta_{ik} \, z_j(t) \right] \tag{3.2.7-6} \]

Since p^k_ij incorporates the weighted sum of all other p^l_ij, equation 3.2.7-5
distributes the error, e_k(t), over all of the weights, w_ij, including the input connected
weights. Each processing element accesses the entire connection weight matrix and
accumulates delta weight adjustments in internal registers (pregs) for each weight and
each processing element. At the end of each input time step the weights are adjusted in
preparation for the next input. Figure 3-13 shows the process block diagram for a
training cycle.
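
A single training cycle of the kind described above can be sketched as follows, in the style of the real-time recurrent learning procedure of Williams and Zipser. Several details are assumptions made for illustration: logistic units, weight columns ordered as external inputs followed by unit outputs, the sensitivities for the next step being computed before the weights are changed, and the names `rtrl_step`, `p` and `targets`.

```python
import math


def rtrl_step(w, p, x, y, targets, rate):
    """One real-time training cycle; w is updated in place.

    w[k][l]    weight into unit k from element l of z(t) = x(t) + y(t)
    p[k][i][j] sensitivity of unit k's output to weight w[i][j]
    targets    dict {unit index: desired value} for the units in T(t)
    Returns (y_next, p_next).
    """
    n, m = len(w), len(x)
    z = list(x) + list(y)
    # Delta error: nonzero only for units that have a target at this step.
    e = [targets[k] - y[k] if k in targets else 0.0 for k in range(n)]
    # Forward pass to the next time step.
    s = [sum(w[k][l] * z[l] for l in range(m + n)) for k in range(n)]
    y_next = [1.0 / (1.0 + math.exp(-s_k)) for s_k in s]
    # Propagate the sensitivities forward one step.
    p_next = [[[0.0] * (m + n) for _ in range(n)] for _ in range(n)]
    for k in range(n):
        f_prime = y_next[k] * (1.0 - y_next[k])
        for i in range(n):
            for j in range(m + n):
                recur = sum(w[k][m + l] * p[l][i][j] for l in range(n))
                direct = z[j] if i == k else 0.0
                p_next[k][i][j] = f_prime * (recur + direct)
    # Weight change from the current errors and current sensitivities.
    for i in range(n):
        for j in range(m + n):
            w[i][j] += rate * sum(e[k] * p[k][i][j] for k in range(n))
    return y_next, p_next
```

The sensitivity array holds one value per unit per weight, so its storage grows as N squared times (M + N); this is the per-weight, per-element register cost noted above.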

Because the network is conditioned by both the input activations and previous output
activations, there are a number of options which can be employed during training. The
first has already been mentioned and involves the update of the weights at each input
iteration. This allows the network to train in real time on a stream of input activations
without accumulating delta weight changes. As a result the network can be applied to
streams of various lengths in real time without having to define epochs, often decreasing
by orders of magnitude the number of cycles required to train the network. This is referred
to as the realtime option.

Figure 3-13 Fully Recurrent Network Dynamics for Learning Algorithm

The shifting of the output to the input for the next input iteration presents two options.
The first is to recycle the output generated by the network, and the second is to replace
the generated output with the target output for the next cycle. The latter is referred to as
forced learning. With forced learning, the network can learn sequences which would be
impossible using the generated outputs. These situations involve learning to bifurcate
an input value to more than one output value depending upon the previous history of output
activations. For example, when learning a sine wave, the network can distinguish whether
the value 0.74 predicts a higher or lower value depending upon the previous output
being lower or higher. Without forced learning the network will attempt to converge on
a value which minimizes the error between the two predicted values, such as 0.74.
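
The two feedback options described above differ only in which values are shifted back to the input. A minimal sketch, with the hypothetical names `next_feedback`, `targets` and `forced`:

```python
def next_feedback(y_generated, targets, forced):
    """Select the unit activations fed back as input on the next cycle.

    targets maps a unit index to its target value for the units that
    have a target at this time step. With forced learning those units
    feed back the target; every other unit feeds back its own output.
    """
    if not forced:
        return list(y_generated)
    return [targets.get(k, y_k) for k, y_k in enumerate(y_generated)]
```

Units without a target are unaffected by the option, so hidden activations are always recycled as generated.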

Another important option is that the target vector can change over the processing
elements so that different subsets of processing elements are selected as output
activations at each time step. This allows certain processing elements to be specialized
to certain target events while still being available internally as hidden processing
elements for other events. Because such a processing element is sometimes an output
processing element and other times a hidden element, its behavior is different from
those that are strictly output or hidden. These processing elements would attempt to
form weight connections which attract certain patterns representing the target and repel
to various degrees other patterns which do not map to the target. This implies that these
processing elements can be self organizing, similar to an adaptive network such as a
Kohonen topological feature map.

Another useful option is to delay the application of a target vector for a specified
number of cycles for an input. In Figure 3-13, this is shown as shift registers for the
input and target activations. The input and target vectors are placed at the top of the shift
register and shifted down with each new input/target pair. The input vector is taken
from the top of the input shift register and the generated output is compared to the target
at the bottom of the target shift register. The output corresponds to the input vector
placed on the shift register n cycles previously. The network is processed and trained just
as before except that the output is delayed. What this option does is cause the hidden
processing elements to form layers similar to backpropagation layers.
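
The target shift register described above can be sketched with a bounded double-ended queue. The class name `TargetDelayLine` and its interface are hypothetical; only the target register is modeled, since the input is always taken from the top of its register.

```python
from collections import deque


class TargetDelayLine:
    """Shift register that returns the target placed `delay` cycles ago."""

    def __init__(self, delay):
        self.register = deque(maxlen=delay + 1)

    def shift(self, target):
        """Place a new target at the top and read the bottom of the register.

        Returns None until the register has filled, meaning that no
        target is applied during the first `delay` cycles.
        """
        self.register.appendleft(target)
        if len(self.register) < self.register.maxlen:
            return None
        return self.register[-1]
```

On each cycle the value returned by `shift` would be compared with the output the network generates for the current input.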

3.2.8  Adaptive  Resonance  Theory  Function  Blocks  (CSC08) 

CSC Purpose. This CSC provides custom function blocks for implementing and
training Adaptive Resonance Theory (ART 1, 2 and 3) neural networks within
embedded system designs. The CSC includes custom coded function blocks for
unsupervised learning and recall processing algorithms. It also provides custom user
function blocks which allow ART networks to perform various functions within a
communications design.

Execution and control data flow. The execution and control data flow is defined by the
propagation architecture of Adaptive Resonance Theory. The following defines the
overall architectures to be implemented by this CSC.

Adaptive Resonance Theory - Binary (ART 1)

ART 1 is the first paradigm developed by Carpenter and Grossberg based upon the
adaptive resonance theory by Grossberg. In the latter paper, Grossberg investigated the
instability of non-linear systems that incorporate feedback, and determined stable
processes that are adaptive and self-organizing. ART 1 is applied to binary input
vectors (i.e., the elements of the vector are either 0.0 or 1.0).

ART 1 Architecture. The network consists of two sets of nodes. The first set, F1,
is connected to the input vector and presents a short term memory (STM) activation
vector at its output. The second set, F2, generates an output activation vector giving the
recalled category from long term memory (LTM) for the current input vector. Initially,
the STM from F1 is activated by the input vector. This pattern activates all of the nodes
in LTM concurrently via bottom-up weights from F1 to F2. These activations compete
until F2 becomes active with a candidate category of a pattern stored in LTM. The F1
STM then receives the recalled LTM pattern via top-down weights from F2 to F1. If
this pattern is not sufficiently close to the input pattern, a strong inhibit signal is sent to
F2. This suppresses the winning category and another category becomes active. If the
new category matches the input pattern, the system becomes stable and resonates for a
sufficient time interval, such that the LTM pattern is reinforced by the new input pattern
and learning occurs.

The learning process involves updating the bottom-up and top-down weights
representing the LTM patterns. If an input pattern has not previously been presented to
the network and is sufficiently different from the other input patterns stored in LTM, the
new pattern is eventually introduced to LTM. This ability to learn new patterns while
distinguishing previously learned patterns represents the adaptive feature of ART 1.
ART 1 is also self-organizing. The categories are selected and the members of those
categories are determined by the properties of the network and not from an external
target vector as is the case of supervised learning paradigms such as back propagation.

The properties of an ART 1 network are defined by a set of parameters representing the
coefficients of the general dynamic equations for the network. These dynamic equations
are based upon a dimensionless form of the membrane equations developed by Lin and
Segal. Though the number of parameters has been reduced to a minimum, there
remains a relatively large number that must be defined to achieve a stable network.
Therefore, tuning an ART 1 network to a particular problem requires understanding the
dynamics of the network from the equations which define the network. The essential
equations are presented without derivation to allow the user to experiment with the
model and develop intuition as to the model's performance under various parameter
settings. For a more complete theoretical understanding of the properties of the ART 1
network, and as a foundation to the other ART networks, refer to the referenced papers
in Chapter 2.

ART 1 STM Equations. We denote the activation vectors from F1 and F2
with the subscripts i = 1, 2, 3,...,N and j = 1,2,3,...,M respectively. N is the number
of elements in the input vector and M is the total number of available categories in
LTM. The parameters A, B, C,... for layers F1 and F2 are denoted with subscripts 1
and 2, respectively. The equations governing the dynamics of the activation vectors
x_i(t) and x_j(t) for F1 and F2 are:

\[ \frac{d}{dt} x_k = -x_k + (1 - A x_k) \, J_k^{+} - (B + C x_k) \, J_k^{-} \tag{3.2.8-2} \]

where J_k^+ is the total excitatory input at unit k = i or j, and J_k^- is the total inhibitory input
to unit k. All of the parameters are non-negative. If A > 0 and C > 0, then the elements
of the activation vectors, x_i(t) and x_j(t), will remain in the finite interval [-B C^{-1}, A^{-1}]
regardless of how large the non-negative inputs J_k^+ and J_k^- become.
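
The boundedness property stated above can be checked numerically. The sketch below assumes the membrane equation has the shunting form dx/dt = -x + (1 - A x) J+ - (B + C x) J- and integrates it with a simple Euler step; the function names, parameter values and step size are hypothetical.

```python
def stm_derivative(x, j_plus, j_minus, A, B, C):
    """Right-hand side of the assumed shunting membrane equation."""
    return -x + (1.0 - A * x) * j_plus - (B + C * x) * j_minus


def settle(j_plus, j_minus, A=1.0, B=0.5, C=1.0, dt=1e-4, steps=2000, x0=0.0):
    """Integrate the activation toward its equilibrium with Euler steps.

    dt must be small relative to 1 / (1 + A*j_plus + C*j_minus) for the
    Euler step to remain stable.
    """
    x = x0
    for _ in range(steps):
        x += dt * stm_derivative(x, j_plus, j_minus, A, B, C)
    return x
```

Setting the derivative to zero gives the equilibrium x = (J+ - B J-) / (1 + A J+ + C J-), which approaches 1/A as J+ grows and -B/C as J- grows, so the activation stays inside the stated interval.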

The excitatory input J_i^+ for the ith node of F1 is the sum of the bottom-up input I_i and
the top-down template input from F2:

\[ J_i^{+} = I_i + D_1 \sum_j f(x_j) \, z_{ji} \tag{3.2.8-3} \]

where f(x_j) is the threshold signal generated for an F2 node by the activation x_j, and z_ji
is the top-down LTM weight from F2 to F1.

The inhibitory input J_i^- is derived from all active nodes in F2:

\[ J_i^{-} = \sum_j f(x_j) \tag{3.2.8-4} \]

If F2 has at least one element active, then J_i^- > 0 and has a non-specific inhibitory effect
on all of the units of F1.


The excitatory input J_j^+ to the jth node of F2 comes from the bottom-up trace of
activations from F1 and the positive feedback signal g(x_j) to itself:

\[ J_j^{+} = g(x_j) + D_2 \sum_i h(x_i) \, z_{ij} + a_j^{+} \tag{3.2.8-5} \]

where h(x_i) is the threshold signal generated by F1 at the activation x_i and z_ij is the
bottom-up LTM weight from F1 to F2. Here the weights z_ij and z_ji denote two
different matrices and not the transposition of the same matrix. The vector element a_j^+
is the excitatory activation from the orienting subsystem which indicates the category is a
learned category. It biases the competition to first find categories which have been
previously learned before starting a new category for the input vector. This is discussed
further later.


The inhibitory input J_j^- to the jth node of F2 is the competitive negative feedback from
all of the other nodes in F2:

\[ J_j^{-} = \sum_{k \ne j} g(x_k) + a_j^{-} \tag{3.2.8-6} \]

where a_j^- is the inhibitory activation signal from the orienting subsystem when the
category does not match the current input pattern. When active, it suppresses the
winning category so that other categories will be considered. The dynamics of the reset
or orienting subsystem are discussed in the subsection entitled "Orienting Subsystem
Equations" for ART 1.

ART 1 LTM Equations. The dynamics for the LTM weights are as follows:

\[ \frac{d}{dt} z_{ji} = f(x_j) \left[ -z_{ji} + h(x_i) \right] \tag{3.2.8-7} \]

\[ \frac{d}{dt} z_{ij} = K_1 f(x_j) \left[ (1 - z_{ij}) \, L_1 \, h(x_i) - z_{ij} \sum_{k \ne i} h(x_k) \right] \tag{3.2.8-8} \]

When a stable resonant configuration in the STM persists, the weights are updated by
the above dynamic equations. For a stable network, it is important that the weights are
modified at a time scale that is long compared to the update of the STM activations.

This is achieved by keeping the time scale for the learning, δt, small while allowing E
to be large enough such that E δt = 1. After the input activation has been presented for
a sufficient interval to indicate that the system is resonating, the ART 1 model solves the
dynamic equations and updates the weights to their asymptotic values for t ≫ 0. This
is referred to as the "quick learn mode". The quick learn mode is engaged when the
following condition is satisfied:


counter  = 


3.2.8-9 


where δt is the time constant and counter is the number of iterations since the last reset.
E is a parameter of the ART 1 implementation that controls the LTM learning rate and is not
originally a Grossberg ART 1 parameter.

ART 1 Orienting Subsystem Equations. The orienting subsystem compares
the STM recalled pattern to the input vector and becomes active when the comparison
falls below a vigilance threshold. The vigilance is set within the range 0.0 to 1.0, where
1.0 indicates that the STM and input patterns must match exactly and lower values
allow proportionally greater dissimilarity among the members within a category. Higher
vigilance will result in greater discrimination among the input vectors, whereas lower
vigilance will tend to group the input patterns with greater variation into the same
category.
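
The effect of the vigilance threshold can be illustrated with the match test commonly used in fast-learning ART 1 simulations: the fraction of active input elements that are also present in the recalled template. This simplified test is swapped in for illustration and is not the dynamic orienting equation of this CSC; the function names are hypothetical.

```python
def match_ratio(input_vec, template):
    """Fraction of the active input elements also active in the template."""
    active = sum(1 for a in input_vec if a)
    if active == 0:
        return 1.0  # assumption: an all-zero input matches any template
    overlap = sum(1 for a, b in zip(input_vec, template) if a and b)
    return overlap / active


def needs_reset(input_vec, template, vigilance):
    """True when the comparison falls below the vigilance threshold."""
    return match_ratio(input_vec, template) < vigilance
```

With the input (1, 1, 0, 1) and the template (1, 0, 0, 1) the match ratio is 2/3, so a vigilance of 0.9 resets the category while a vigilance of 0.5 accepts it.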

When a mismatch is found, the orienting subsystem is "aroused" and an inhibitory
signal is transmitted to F2 which has the following dynamics:

\[ \frac{d}{dt} a_j^{-} = G_1 \left( -a_j^{-} + \alpha \, g(x_j) \right) \tag{3.2.8-10} \]

where α is the arousal level for the inhibitory signal and is a function of the vigilance.
When the arousal is greater than zero, the active element x_j transmits a self-inhibiting
signal until it is zero, at which time the inhibitory signal decays at a time scale of
G_2 δt.


When the orienting subsystem is not aroused, an excitatory activation signal is sent for all
categories that have been learned by the network. The equation for this process is:

r  r  ^  > 


K.P 


-5(a))  3.2.8-11 


where ρ is the vigilance, P is the average initial value for the bottom-up weights, which
is less than L2/(L2 - 1 + N), and δ(α) = 1 when α > 0, and δ(α) = 0 otherwise. This
sends a self-excitatory signal to the already learned categories slightly above the average
signal, P, of the unlearned or initialized categories. Therefore, the competition is biased
to the learned categories for initial searches, but still allows new categories to be
generated from the uninitialized weights. The effect of choosing L2 large is to bias the
network to choose uncommitted category elements in response to unfamiliar input
patterns.

ART 2 applies the adaptive resonance theory developed by Carpenter and Grossberg to
input vectors whose elements can vary continuously over the range from 0.0 to 1.0.
ART 2 is an example of unsupervised learning where the output represents the category
for the input pattern determined by the network itself. No target vector is provided
during training. ART 2 is also an example of adaptive learning in that when new
patterns are presented which are significantly different from the previously learned
patterns, the network will recognize that the patterns are different and form new
categories.

ART 2 Architecture. The architecture for ART 2 is shown in Figure 3-14. The sets
of nodes, F1 and F2, are connected by long term memory (LTM) weights. An orienting
subsystem O contains the vigilance parameter that resets F2 if the recalled pattern is
significantly dissimilar from the input pattern. A gray scale input pattern activation I_i
enters the network at F1, which eventually sends activation signals to F2 via bottom-up
weights z_ij. These signals compete, with the winning category in vector y_j sending a
recalled pattern via the top-down weights at vector p_i. This activation vector, along
with the short term memory (STM) activation, u_i, establishes a candidate model of the
input pattern at r_i.

This expected activation vector is compared against the vigilance threshold ρ to
determine closeness of fit. If the expected pattern sufficiently matches the current input,
the STM vector is modified via vector v_i, and the system resonates for the recalled
category. If the expected vector is dissimilar, the orienting system will send an
inhibitory signal to F2, suppressing the winning category and allowing the network to
test the next category.

The ART 2 architecture diagram shows the activation vectors for F1 and F2 and their
interconnections. The black dots in the figure indicate the application of a non-specific
excitation bias during the computation of the activation signal. The other arrows
indicate the inputs to the vectors representing excitatory activities within the unit.

Figure 3-14 ART2 Architecture

ART 2 STM Equations: F1. The STM activity V_i of the ith element at any one of
the F1 processing stages obeys the Lin-Segal membrane equation:

\[ \epsilon \frac{d}{dt} V_i = -A V_i + (1 - B V_i) \, J_i^{+} - (C + D V_i) \, J_i^{-} \tag{3.2.8-12} \]

for i = 1 ... M, where M is the number of elements in the input vector, J_i^+ is the total
excitatory input to the ith element, and J_i^- is the total inhibitory input. With no input
signal, the activity decays to 0 with a relaxation time given by A. The dimensionless
parameter ε is the ratio between the STM relaxation time and the LTM relaxation time,
which is 0 < ε ≪ 1. By setting B = C = 0, the ART 2 activation equation has the
asymptotic form, where ε approaches 0, given by:

\[ V_i = \frac{J_i^{+}}{A + D J_i^{-}} \tag{3.2.8-13} \]

In this form the dimensionless equations characterizing the STM activities, p_i, q_i, u_i,
v_i, w_i and x_i are computed at F1 as follows:


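\[ p_i = u_i + \sum_j g(y_j) \, z_{ji} \tag{3.2.8-14} \]

\[ q_i = \frac{p_i}{e + \lVert p \rVert} \tag{3.2.8-15} \]

\[ u_i = \frac{v_i}{e + \lVert v \rVert} \tag{3.2.8-16} \]

\[ v_i = f(x_i) + b \, f(q_i) \tag{3.2.8-17} \]

\[ w_i = I_i + a \, u_i \tag{3.2.8-18} \]

\[ x_i = \frac{w_i}{e + \lVert w \rVert} \tag{3.2.8-19} \]

where ||V|| denotes the inhibitory signal from the other elements within the unit V given
by:

\[ \lVert V \rVert = \sqrt{\sum_i V_i \cdot V_i} \tag{3.2.8-20} \]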

and where y_j is the STM activity of the jth node in F2. The nonlinear signal function
f(x) is of the form:

\[ f(x) = \begin{cases} 0 & \text{if } 0 \le x < \theta \\ x & \text{if } x \ge \theta \end{cases} \tag{3.2.8-21} \]

which is piece-wise linear. The above activation equations, computed in the order
given, will result in a dynamic network where the activations are shifted into the next
activation as the input vector persists at the input.
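
The way the F1 activations shift from one stage to the next while the input persists can be sketched as follows. The sketch assumes the standard ART 2 forms for the six F1 activities (p, q, u, v, w, x), an inactive F2 (so that p equals u), and hypothetical values for the parameters a, b, e and theta; the function names are likewise hypothetical.

```python
import math


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def f(x, theta):
    """Piece-wise linear signal function: zero below the threshold."""
    return x if x >= theta else 0.0


def f1_sweeps(inp, a=10.0, b=10.0, e=1e-6, theta=0.2, sweeps=10):
    """Repeat the F1 computations in the order p, q, u, v, w, x."""
    n = len(inp)
    u, v, x = [0.0] * n, [0.0] * n, [0.0] * n
    p = [0.0] * n
    for _ in range(sweeps):
        p = list(u)                              # F2 inactive: no top-down term
        norm_p = norm(p)
        q = [p_i / (e + norm_p) for p_i in p]
        norm_v = norm(v)
        u = [v_i / (e + norm_v) for v_i in v]
        v = [f(x_i, theta) + b * f(q_i, theta) for x_i, q_i in zip(x, q)]
        w = [i_i + a * u_i for i_i, u_i in zip(inp, u)]
        norm_w = norm(w)
        x = [w_i / (e + norm_w) for w_i in w]
    return u, p
```

Because each stage reads values left by the previous sweep, the input needs a few sweeps to reach u. Input components whose normalized value stays below theta are suppressed to zero, which acts as a simple form of noise suppression.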


ART 2 STM Equations: F2. Initially, F2 is inactive with no category selected from LTM. At this stage F2 is presented with an input signal from F1 via $p_i$, which is an adapted form of the input pattern. The signal across all categories in LTM is computed concurrently as:

$$y_j = \sum_i p_i z_{ij} \tag{3.2.8-22}$$

for j = 1 ... N, where N is the number of categories or storage capacity of LTM. When the signal passes a threshold and F2 makes a choice, the activation vector $y_j$ is passed through a gated dipole threshold function given by:

$$g(y_j) = \begin{cases} d & \text{if } y_j = \max\{ y_j \mid \text{the jth F2 node has not been reset on the current trial} \} \\ 0 & \text{otherwise} \end{cases} \tag{3.2.8-23}$$
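As a rough illustration of the F2 choice, the sketch below computes the bottom-up signal and picks the largest node that has not been reset. The function name `f2_choice`, the nested-list weight layout `z_bu[i][j]`, and the value d = 0.9 are assumptions made for the example.

```python
def f2_choice(p, z_bu, reset, d=0.9):
    """Return (y, g, J) for one F2 choice.

    p     : F1 output vector p_i
    z_bu  : bottom-up weights, z_bu[i][j] from F1 node i to F2 node j (assumed layout)
    reset : set of F2 node indices already reset on the current trial
    """
    n_cat = len(z_bu[0])
    # Bottom-up signal to every category node, computed "concurrently" (eq. 3.2.8-22).
    y = [sum(p[i] * z_bu[i][j] for i in range(len(p))) for j in range(n_cat)]
    eligible = [j for j in range(n_cat) if j not in reset]
    if not eligible:
        return y, [0.0] * n_cat, None
    J = max(eligible, key=lambda j: y[j])
    # Winner-take-all output (eq. 3.2.8-23): d for the winner, 0 otherwise.
    g = [d if j == J else 0.0 for j in range(n_cat)]
    return y, g, J

weights = [[0.2, 0.9, 0.5], [0.1, 0.1, 0.1]]
print(f2_choice([1.0, 0.0], weights, reset=set()))
print(f2_choice([1.0, 0.0], weights, reset={1}))
```

Passing a non-empty `reset` set models the search after a mismatch: the previously winning node is excluded and the next largest node is chosen.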


This signal is passed back to F1 via the top-down weights, $z_{ji}$, modifying the F1 activity at $p_i$ which reduces to the following:

$$p_i = \begin{cases} u_i & \text{if F2 is inactive} \\ u_i + d z_{Ji} & \text{if the Jth F2 node is active} \end{cases} \tag{3.2.8-24}$$

This activity is combined with the STM activation, $u_i$, to form the activation $r_i$. This last vector has properties essential to the orienting of the network.

ART 2 Orienting Subsystem Equations. The activity at $r_i$ is such that $r_i$ will attempt to model the input pattern with a pattern recalled from long term memory. That is to say, it will attempt to match the input pattern as given by the STM vector with a pattern which the system has previously learned and stored in LTM. The match may not be exact. The equation for this activity is:

$$r_i = \frac{u_i + c p_i}{e + \|u\| + \|c p\|} \tag{3.2.8-25}$$

The activation vector has been normalized such that its norm, $\|r\|$, will be 1 if the patterns match and will be less than one in proportion to the dissimilarity between STM and LTM patterns. The degree to which patterns are allowed to be dissimilar before resetting the network is given by the vigilance parameter, $\rho$. The orienting subsystem will reset F2 if the following condition is satisfied:

$$\frac{\rho}{e + \|r\|} > 1 \tag{3.2.8-26}$$

where the vigilance parameter $\rho$ is set between 0 and 1. The reset causes the previously winning node in $y_j$ to be suppressed and then another pattern to be recalled from F2 and submitted to F1.
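The reset test can be illustrated with a short sketch. It assumes the standard ART 2 form of the match vector, r_i = (u_i + c p_i) / (e + ||u|| + ||c p||), and uses assumed values c = 0.1 and e = 1e-6; the function name `orienting_reset` is hypothetical.

```python
import math

def norm(v):
    """Euclidean (L2) norm."""
    return math.sqrt(sum(x * x for x in v))

def orienting_reset(u, p, rho, c=0.1, e=1e-6):
    """Return (reset, r): reset is True when rho / (e + ||r||) > 1."""
    cp = [c * pi for pi in p]
    denom = e + norm(u) + norm(cp)
    r = [(ui + cpi) / denom for ui, cpi in zip(u, cp)]
    return rho / (e + norm(r)) > 1.0, r

# Parallel patterns: ||r|| is close to 1, so no reset at vigilance 0.9.
print(orienting_reset([1.0, 0.0], [2.0, 0.0], rho=0.9)[0])
# Orthogonal patterns of equal weight: ||r|| is about 0.707, so F2 is reset.
print(orienting_reset([1.0, 0.0], [0.0, 10.0], rho=0.9)[0])
```

Raising rho toward 1 makes the test stricter, so smaller mismatches between the STM pattern and the recalled LTM pattern trigger a reset.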

When the above condition is not satisfied, the network will begin to resonate. This can be observed in the activity of $r_i$. At first $r_i$ will appear as a superposition of $p_i$ and $u_i$, but then gradually becomes a modeled version of the input vector. It highlights those features of the input that categorize it in LTM. For high vigilance, $r_i$ will become an accurate form of the input. For lower vigilance (i.e., more patterns within each category), specific features will be shown. This modifies the STM activity, $u_i$, which will show an idealized version of the input pattern.

When processing input patterns in the presence of noise, the non-specific parameter e acts as an excitation bias that desensitizes the network to noise fluctuations. In the computation of activation signals, e is the rate at which the activation decays given no excitatory or inhibitory input to the vector. In the computation of the vigilance ratio, e biases the reset to a higher vigilance threshold.


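ART 2 LTM Equations. When the network resonates for a time scale that is long compared to the STM settling time, the LTM weights are modified by the following dynamic equations:

$$\frac{d}{dt} z_{Ji} = g(y_J)[p_i - z_{Ji}] = d[p_i - z_{Ji}] = d(1-d)\left[\frac{u_i}{1-d} - z_{Ji}\right] \tag{3.2.8-27}$$

$$\frac{d}{dt} z_{iJ} = g(y_J)[p_i - z_{iJ}] = d[p_i - z_{iJ}] = d(1-d)\left[\frac{u_i}{1-d} - z_{iJ}\right] \tag{3.2.8-28}$$

$$\forall j \ne J: \quad \frac{d}{dt} z_{ji} = 0 = \frac{d}{dt} z_{ij} \tag{3.2.8-29}$$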
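The LTM dynamics for the winning node can be integrated numerically to see where the weights settle. The sketch below assumes the standard ART 2 form dz/dt = d(1-d)[u_i/(1-d) - z] for the winning node J, and uses a simple Euler step; the step size, step count, and function names are illustrative assumptions.

```python
def ltm_step(z, u, d=0.9, dt=0.1):
    """One Euler step of dz_i/dt = d(1-d)[u_i/(1-d) - z_i] for the winning node J."""
    return [zi + dt * d * (1.0 - d) * (ui / (1.0 - d) - zi) for zi, ui in zip(z, u)]

def learn(u, d=0.9, dt=0.1, steps=2000):
    """Integrate the winning node's weights from zero while u is held fixed."""
    z = [0.0] * len(u)
    for _ in range(steps):
        z = ltm_step(z, u, d, dt)
    return z

print(learn([0.6, 0.8]))
```

Under these assumptions the weights approach u_i/(1-d), so a normalized STM pattern u is stored with length 1/(1-d). Weights of nodes that did not win are simply left unchanged.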


Adaptive Resonance Theory - Hierarchical (ART 3)

ART 3 completes the trilogy of adaptive resonance theory network architectures by implementing hierarchical search with multiple ART 2 networks. To achieve this, the F1 and F2 modules in the ART 2 architecture must become homologous and bi-directional. Note that F1 and F2 are not the same in the ART 2 architecture. In the ART 3 architecture, the F2 field, which gives the category encoding of the input pattern at F1, is implemented as an F1 STM field. This means that the input pattern to F2 is homologous to the input pattern to F1. As a result, partial compression of the category encoding is introduced (ART 1 and ART 2 implement maximal compression encoding) to allow for multiple winners during the categorization of a dynamic input pattern. To allow for distributed competition between F1 bottom-up and F2 top-down retrieval, a medium term memory (MTM) is added to the ART 2 architecture that models the chemical transmitters within biological neural systems. The MTM operates on a time scale that is longer than that of STM but sufficiently shorter than that of LTM to allow the system to stabilize before learning. These changes to ART 2 allow the F1 STM fields to be cascaded within a multiple layer architecture where the STM represents both input patterns and partially compressed categories.

ART 3 Architecture. The architecture for ART 3 is shown in Figure 3-15. The sets of nodes Fa, Fb, and Fc consist of the three layer unit architecture used in F1 of the ART 2 architecture. The output from Fa is input to Fb without adaptive weights. As a result the number of elements in the output from Fa matches the number of elements in the input of Fb. Adaptive bottom-up and top-down weights connect Fb and Fc. This allows the number of elements into Fc to differ from the number of elements from Fb. The orienting subsystem combines the STM output from Fa and Fb to establish the resonance pattern $r_i$, which is compared to the vigilance parameter, $\rho$. The reset signal is sent to Fb and Fc and then to the adaptive weights connecting these modules.


By convention, $x_i^{\alpha\lambda}$ represents the input of the ith element in the $\lambda$th layer of the $\alpha$th STM field, and $y_i^{\alpha\lambda}$ represents the output from the ith element in the $\lambda$th layer of the $\alpha$th STM field. The pairing of $x_i^{\alpha\lambda}$ and $y_i^{\alpha\lambda}$ defines a unit layer in the ART 3 STM field. This parallels the layering found in the F1 STM field of ART 2. The output from the middle layer represents the internal STM value for the input pattern, and the output from the third layer is a pattern that is affected by the combination of top-down retrieved and internal STM patterns.




Figure  3-15  ART  3  Architecture 

The LTM memory traces are as defined in the ART 2 architecture. However, in ART 3 the retrieved bottom-up and top-down signals are modulated over time by the chemical transmitters accumulating at each ith or jth node. The dynamics of the transmitters are given by the equations for presynaptic and bound transmitters at each node. The presynaptic transmitters define the potential for generating chemical transmitters from a synapse node for a given signal and weight. The bound transmitters define the resulting transmitters that are bound at the receptive synapse node. This results in a postsynaptic activation that represents the input to the receiving STM layer. As a result of this mechanism, distributed nodes (i) can affect the state at a receptive node (j) over a time scale that is long compared to STM dynamics but is short with respect to LTM adaptation.

The new ART search mechanism has a number of useful properties:

a. works well for mismatch, reinforcement, or input reset;

b. is simple;

c. is homologous to physiological processes;

d. fits naturally into network hierarchies with distributed codes and slow or fast learning;

e. is robust in that it does not require precise parameter choices, timing, or analysis of classes of inputs;

f. requires no new anatomy, such as new wiring or nodes, beyond what is already present in the ART 2 architecture;

g. brings new computational power to the ART systems;

h. although derived for the ART system, can be used to search other neural network architectures as well.


ART 3 STM Equations. Except during reset, equations used to generate the STM values are similar to ART 2 equations. Dynamics of the fields Fa, Fb, and Fc are homologous. The steady-state variables for the fields, when the reset signal equals 0, are given by the following:


Input variable. The input value for the ith unit in the $\lambda$th layer of the $\alpha$th field is defined by the dynamics:


_  “  oA  a*  ,  cic 


cU 


aa-t) 


a*.  „a(X*l) 


+A  Si 


3.2.8-30 


In  steady  state. 


3.2.8-31 


where $\lambda$ is 1, 2, or 3 and $\alpha$ is field a, b, or c. If $\lambda - 1 = 0$, the term refers to layer 3 of the next field down from $\alpha$. If $\lambda + 1 = 4$, the term refers to layer 1 of the next field up from $\alpha$.

Output variable. The output variable is a normalization of the corresponding input variable. The output variable corresponding to the input variable for the ith unit in the $\lambda$th layer of the $\alpha$th field is defined by the dynamics:


cA 


Jk. 


3.2.8-32 


where the interfield input signals are given by the non-linear signal function:

$$S_i^{\alpha\lambda} = g^{\alpha}(y_i^{\alpha\lambda}) \tag{3.2.8-33}$$

and the top-down intrafield signal across the adaptive synapse is:
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(a-t-Oa 

ji 


3.2.8-34 


The normalization is the Euclidean norm or L2 norm used in ART 2 (eq. 3.2.8-). This supports the orderly pattern transformations under a variable processing load and direct access to learned category representations without searching in LTM.

ART 3 Signal Functions. The ART 3 signal functions have two forms depending upon whether the matching uses distributed (partially compressed) or choice (maximally compressed) encodings of the output variable:

Distributed 


8‘‘iw)  = 


0  if  w^p°  +p“ 


Choice 


g®(w)  = 


if  w>  p“  + 

ll  Pt  J 

f  0 

ifw^pfl 

if  w  >  I 

ll  Pi  J 

J 

3.2.8-35 


3.2.8-36 


In  the  case  of  choice  the  p7a  parameter  is  dependent  upon  the  number  of  categories  by: 

P?  ~  ^  resulting  output  signal  will  reflect  only  me  choice  regardless  of 

the  number  of  choices.  Otherwise  the  parameters  p7  and  pg  are  non-negative  (0.0  - 

1.0). 

ART 3 Transmitter Equations. When the reset signal equals 0, the levels of presynaptic and bound transmitter are governed by the following equations:

Presynaptic  transmitter,  Fb  ->  Fc 


d  he  (  be  be\  6e  _ 

~\Zii  Uv  ]  Uij  P 

3.2.8-37 

Bound  transmitter,  Fb  ->  Fc 

d  be  be  be  cl 

^Vy  =-Vv  -Mv  P\Xi 

3.2.8-38 

Presynaptic  transmitter,  Fc  ->  Fb 

d  cb  (  eb  cb\  cb  _ 

-^Uil=(z,-U,)-UhP 

3.2.8-39 

Bound transmitter, Fc -> Fb
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3.2.8-40 


ART 3 Reset Equations. Reset occurs when patterns active at Fa and Fb fail to match according to the criterion set by the vigilance parameter. The reset unit for the ith node is


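$$r_i^{b} = \frac{y_i^{a2} + y_i^{b2}}{p_3 + \|y^{a2}\| + \|y^{b2}\|} \tag{3.2.8-41}$$

Reset of the Fb and Fc fields occurs if

$$\|r^{b}\| < \rho^{b} \tag{3.2.8-42}$$

where

$$0 < \rho^{b} < 1 \tag{3.2.8-43}$$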

The effect of a large reset signal is approximated by setting input variables and bound transmitter variables equal to 0.

3.2.9 Brain State in a Box Function Blocks (CSC09)

CSC Purpose. This CSC provides custom function blocks for implementing and training Brain State in a Box (BSB) neural networks within embedded system designs. The CSC includes custom coded function blocks for unsupervised learning and recall processing algorithms. It also provides custom user function blocks which allow BSB networks to perform various functions within a communications design.

Execution and control data flow. The execution and control data flow is defined by the propagation architecture of Brain State in a Box (BSB). The following defines the overall architecture to be implemented by this CSC.

BSB: Linear Associator. The BSB Network is an example of associative memory, which has been extensively investigated by Hopfield and Kosko. It is based upon the properties of a linear associator using a generalized Hebb learning rule. The output, $y_j$, from the linear associator is generated from the weighted input, $x_i$, as follows:

$$y_j = \sum_i w_{ji} x_i \tag{3.2.9-1}$$

where i = 1 ... N and j = 1 ... M, and N and M are the number of input and output elements respectively. The associative memory weights, $w_{ji}$, are adjusted according to a generalized Hebb rule with each input/output pair, k, given by:

$$\Delta w_{ji}^{k} = \eta y_j^{k} x_i^{k} \tag{3.2.9-2}$$

where $y_j^{k}$ and $x_i^{k}$ are the kth associative learning example. The associative weights are the sums of the outer vector products:

$$w_{ji} = \eta \sum_{k=1}^{n} y_j^{k} x_i^{k} \tag{3.2.9-3}$$

where $\eta$ is a learning constant.

To illustrate the properties of this network, define the input as a combination of a mean, $p_i$, combined with a non-linear distortion, $d_i^{k}$, as follows:

$$x_i^{k} = p_i + d_i^{k} \tag{3.2.9-4}$$

Substituting this into equation 3.2.9-3, the associative weights are:

$$w_{ji} = \eta \sum_{k=1}^{n} y_j x_i^{k} = \eta y_j \left( n p_i + \sum_{k=1}^{n} d_i^{k} \right) \tag{3.2.9-5}$$

where n is the number of training examples, $x^{k}$, and y is the corresponding output state associated with these examples. If it is assumed that the distortions sum to a zero mean error, then the associative weights relate the output, $y_j$, with the mean input, $p_i$, given by:

$$w_{ji} = \eta n y_j p_i \tag{3.2.9-6}$$

Even though none of the examples explicitly provide the mean, the network is able to extract the mean input even if the distortion is non-gaussian or non-linear.
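The mean-extraction property can be checked numerically. The sketch below is only an illustration under stated assumptions: the distortions are drawn from a uniform (non-gaussian) distribution and generated in plus/minus pairs so that they sum to zero, which is the assumption behind eq. 3.2.9-6. The mean vector, output state, learning constant, and function names are all hypothetical.

```python
import random

def hebb_weights(examples, y, eta):
    """Accumulate w_ji = eta * sum_k y_j * x_i^k (eq. 3.2.9-3 with a fixed output y)."""
    w = [[0.0] * len(examples[0]) for _ in y]
    for x in examples:
        for j, yj in enumerate(y):
            for i, xi in enumerate(x):
                w[j][i] += eta * yj * xi
    return w

def make_examples(mean, pairs, seed=0):
    """Build inputs x = mean + d and x = mean - d so the distortions sum to zero."""
    rng = random.Random(seed)
    examples = []
    for _ in range(pairs):
        d = [rng.uniform(-1.0, 1.0) for _ in mean]
        examples.append([m + di for m, di in zip(mean, d)])
        examples.append([m - di for m, di in zip(mean, d)])
    return examples

mean = [1.0, -0.5, 0.25]   # hypothetical mean input p
y = [1.0, -1.0]            # hypothetical associated output state
eta = 0.01                 # hypothetical learning constant
examples = make_examples(mean, pairs=50)
w = hebb_weights(examples, y, eta)
n = len(examples)
print(w[0], [eta * n * y[0] * m for m in mean])
```

No single example equals the mean, yet the accumulated weights match eta * n * y_j * p_i to within rounding error.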

BSB Architecture. Figure 3-16 gives the architecture for a general application of the above linear associator. The outputs for the BSB processing elements are fully connected to one another, such that the output of the jth unit at time t is given by:


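$$y_j(t) = \alpha \sum_i w_{ji} s_i(t) + \gamma s_j(t) + \delta s_j(0) \tag{3.2.9-7}$$

where $s_j(t)$ is the state of the jth unit at time t, and $\alpha$, $\gamma$, and $\delta$ are constants weighting the three terms.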

The first term passes the current system state through the weight connections to generate the associative state, $y_j$, at the current time step. The second term causes the current output state to decay slightly. This has the effect of forcing the distortion to zero over multiple iterations. The third term keeps the initial information constantly present and has the effect of limiting the flexibility of the possible states of the dynamical system since some vector elements are strongly biased by the initial input. The output is passed through a non-linear limit function to generate the next state of the system:

$$s_j(t+1) = f(y_j(t)) \tag{3.2.9-8}$$

The purpose of the limit function, f(), is to maintain the output state within limits for multiple iterations of feedback through the network. The output is not allowed to exceed the positive limit or be less than the negative limit. The limiting process contains the state vector for the dynamical system, hence the designation, brain state in a box.

The dynamical system is free to move within the boxed limits established by the network. For multiple iterations, the state fluctuates until it settles to a stable state for the given initial input vector. For auto-associative, symmetrical weights the final state is associated with the minimum energy state of the network. By changing the input state, the system will move to another attractor which has been learned by the network.
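A small sketch can make the "box" behavior concrete. It assumes a three-term update (weighted feedback, a slightly decayed copy of the current state, and a fraction of the initial input) followed by clipping to [-1, 1]. The coefficient names alpha, gamma, and delta, their values, and the stored pattern are assumptions for illustration, not values from the report.

```python
def limit(v, lo=-1.0, hi=1.0):
    """Limit function f(): clip the output to the walls of the box."""
    return max(lo, min(hi, v))

def bsb_step(w, s, s0, alpha=0.5, gamma=0.9, delta=0.1):
    """One BSB iteration: feedback term + decayed state + initial-input term, then clip."""
    n = len(s)
    nxt = []
    for j in range(n):
        feedback = sum(w[j][i] * s[i] for i in range(n))
        nxt.append(limit(alpha * feedback + gamma * s[j] + delta * s0[j]))
    return nxt

def bsb_recall(w, s0, iterations=20):
    """Iterate from the initial input s0 and return the final state."""
    s = list(s0)
    for _ in range(iterations):
        s = bsb_step(w, s, s0)
    return s

# Auto-associative, symmetric weights storing one hypothetical pattern.
pattern = [1.0, -1.0, 1.0, -1.0]
w = [[a * b / len(pattern) for b in pattern] for a in pattern]
print(bsb_recall(w, [0.5, -0.3, 0.2, -0.6]))
```

Starting from a distorted version of the stored pattern, the state is pushed outward until every element sits on a wall of the box, which is the corner corresponding to the stored pattern.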

Figure 3-16 Brain State in a Box Architecture (diagram labels: Output Activation; Auto Associator Elements)

BSB: Error Correction Equation. By using a Widrow-Hoff error correcting procedure, the associative weights are incrementally adjusted by $\Delta w_{ji}$ given by:

$$\Delta w_{ji} = \eta \left( y_j^{k} - \sum_i w_{ji} x_i^{k} \right) x_i^{k} \tag{3.2.9-9}$$

This error correcting procedure will have the cumulative effect of forcing error to zero so that the weights approximate a least mean squares mapping of the input to the output vector.
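The error-correcting update can be sketched as follows. This is a minimal illustration assuming the usual Widrow-Hoff (LMS) form, in which each weight moves in proportion to the output error times the input. The target mapping, training inputs, learning constant, and function names are hypothetical.

```python
def widrow_hoff_step(w, x, y, eta):
    """Apply one Widrow-Hoff correction for the pair (x, y), updating w in place."""
    for j in range(len(y)):
        err = y[j] - sum(w[j][i] * x[i] for i in range(len(x)))
        for i in range(len(x)):
            w[j][i] += eta * err * x[i]

def train(pairs, n_out, n_in, eta=0.1, epochs=200):
    """Cycle through the training pairs, starting from zero weights."""
    w = [[0.0] * n_in for _ in range(n_out)]
    for _ in range(epochs):
        for x, y in pairs:
            widrow_hoff_step(w, x, y, eta)
    return w

# Hypothetical linear mapping y = A x that the weights should recover.
A = [[2.0, -1.0], [0.5, 3.0]]
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
pairs = [(x, [sum(a * xi for a, xi in zip(row, x)) for row in A]) for x in inputs]
print(train(pairs, n_out=2, n_in=2))
```

Because the example targets are exactly linear in the inputs, the error can be driven to zero and the learned weights converge to A. With noisy targets the same rule would settle near the least mean squares solution instead.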
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4 .  NEURAL  NETWORK  COMMUNICATIONS  SYSTEM 
APPLICATIONS 

This section summarizes the results of simulating Neural Network applications in Communications Signal Processing. The simulation objectives were:

1. Demonstrate a Neural Network simulation capability for the study of Neural Network applications to communications systems and signal processing.

2. Investigate how selected areas of communications can benefit from Neural Network technology via the developed simulation capability.

3. Use the simulation to identify neural network configurations to be included in the conceptual design of a phase II neural network transceiver.

In the following summary of selected applications of Neural Network technology, each simulation is summarized using the following format:

Introduction-           A brief statement to preface the simulation which follows.

Overview-               A description of the problem, including a block diagram of the system and background information on certain system components.

Simulation Parameters-  Specific parameters used by the Neural Network configuration:

    Paradigm                    Neural Network architecture/learning algorithm used
    Input Nodes                 Size of the input vector
    Hidden Nodes                Number of internal Neural Processing nodes
    Output Nodes                Number of output nodes (the size of the output vector)
    Learn Rate                  Degree to which the error signal is applied to weight changes
    Momentum                    Degree to which previous weight changes influence future weight changes
    Update Interval             Number of training vectors between weight updates
    Number of Passes            Number of times that the training set was applied during learning
    Training Size               The total training size, giving the total number of input vectors applied during learning
    SPW Iterations per vector   The number of simulation iterations to produce a single input vector

Results-                Observations and data collected which display the results of the Neural Network, including performance curves and signal plots.

Lessons Learned-        Experience gained in the effort to apply the Neural Network technology to the problem which may be useful in other Neural Network applications. (Included when applicable.)

Potential Extension-    A discussion of areas of future research for Neural Network applications related to the current problem. (Included when noteworthy.)

Procedures-             Procedures to execute the simulations described in Section 4 are documented in the Appendix.
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4.1  SIMPLE  NON-LINEAR  MAPPING  WITH  A  NEURAL 
NETWORK 

Application

Non-linear Mapping by a Backpropagation Neural Network

Introduction

Backpropagation Neural Networks have been shown to be capable of emulating continuous functions and discrete mappings from one input vector to another. In an initial experiment, we investigated the capability of Backpropagation networks to learn a simple non-linear mapping such as the square function.

Overview

This problem was studied using the block diagram below:


A training set was created consisting of 21 real numbers between -1.0 and 1.0, inclusive, spaced by a distance of 0.1. The target consisted of the input value squared, multiplied by a factor of 0.5. The factor 0.5 was chosen because the output of the Backpropagation network produces values ranging from -0.5 to 0.5 (which is a result of the sigmoid non-linear transfer function in each output node). An input value and target value were presented to the Backpropagation Network at each iteration. Weights were updated in batch mode after each pass through the training set. In a second experiment the neural network was trained to learn the 0.5x^3 function.
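The experiment described above can be approximated with a small, self-contained sketch. It is not the SPW function block used in the report. It assumes a 1-9-1 network whose units use a sigmoid shifted to the range -0.5 to 0.5, plain batch gradient descent without momentum, gradients averaged over the 21 samples, and a learning rate of 0.5; those choices and all names are assumptions for illustration.

```python
import math
import random

def act(z):
    """Sigmoid shifted to the range (-0.5, 0.5)."""
    return 1.0 / (1.0 + math.exp(-z)) - 0.5

def dact(a):
    """Derivative of act, written in terms of its output a."""
    return (a + 0.5) * (0.5 - a)

def train(passes=500, hidden=9, rate=0.5, bound=0.3, seed=0):
    """Batch backpropagation on the target 0.5*x^2; returns the MSE seen on each pass."""
    rng = random.Random(seed)
    xs = [k / 10.0 for k in range(-10, 11)]      # 21 inputs from -1.0 to 1.0
    ts = [0.5 * x * x for x in xs]               # targets
    w1 = [rng.uniform(-bound, bound) for _ in range(hidden)]
    b1 = [rng.uniform(-bound, bound) for _ in range(hidden)]
    w2 = [rng.uniform(-bound, bound) for _ in range(hidden)]
    b2 = rng.uniform(-bound, bound)
    n = len(xs)
    history = []
    for _ in range(passes):
        gw1 = [0.0] * hidden
        gb1 = [0.0] * hidden
        gw2 = [0.0] * hidden
        gb2 = 0.0
        sse = 0.0
        for x, t in zip(xs, ts):
            h = [act(w1[k] * x + b1[k]) for k in range(hidden)]
            o = act(sum(w2[k] * h[k] for k in range(hidden)) + b2)
            err = t - o
            sse += err * err
            d_o = err * dact(o)                  # output delta
            gb2 += d_o
            for k in range(hidden):
                d_h = d_o * w2[k] * dact(h[k])   # hidden delta
                gw2[k] += d_o * h[k]
                gw1[k] += d_h * x
                gb1[k] += d_h
        history.append(sse / n)
        # Batch mode: weights change only after the full pass through the set.
        for k in range(hidden):
            w1[k] += rate * gw1[k] / n
            b1[k] += rate * gb1[k] / n
            w2[k] += rate * gw2[k] / n
        b2 += rate * gb2 / n
    return history

history = train()
print(history[0], history[-1])
```

The `bound` argument sets the range of the random initial weights, so the sketch can also be used to compare different initialization bounds.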

Simulation  Parameters 


    Paradigm        Backpropagation
    Input Nodes     1
    Hidden Nodes    9
    Output Nodes    1
    Learn Rate      1.5
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Momentum 


Update  Interval 


Number  of 
Passes 


9996 


1 


Results 

Figure 4.1-1 displays two X vs Y plots where the neural network is trained to learn the 0.5x^2 function. Plot A1 shows the target value (vertical axis) for each corresponding input value (horizontal axis) in the training set. Plot A2 shows the mapping created by the neural network after training has completed.

Figure 4.1-1 Neural Network Target and Output for 0.5x^2 Function

Figure 4.1-2 displays similar plots for the 0.5x^3 function. Further training with a lower learning rate would increase the precision of the mappings.

The neural network has been trained to learn a function by providing it with only 21 pairs of input and target values over the range from -1.0 to 1.0. To examine the ability of the network to generalize, i.e., perform the desired mapping for inputs for which it has not been trained, we held the weights at their final values and added noise to the input values, thus approximating a continuum of inputs. Figure 4.1-3 shows that the network can indeed generalize, which in this case means specifically that the neural network effects a smooth interpolation between the training points. However, the A2 plot in Figure 4.1-3 suggests that the generalization is not valid for values of x outside the interval from -1 to 1. Note how the A2 plot in Figure 4.1-3 reveals more effectively than A1 the incomplete convergence to the square function, even though the A2 plot overlays the A1 plot.

Lessons Learned

Upon initialization of the Backpropagation algorithm, weights are initialized randomly between an upper and lower bound. As a default -0.3 and +0.3 are used. These bounds produced inferior results. For the square function, training time was significantly longer. The mapping remained linear for about 400 passes through the training set before significant convergence began. For the 0.5x^3 function, a local minimum was found and the correct mapping was never reached. Upon changing the initial bounds to -4.0 and 4.0, the network quickly converged to a good solution.

Figure 4.1-2 Neural Network Target and Output for 0.5x^3 Function

Figure 4.1-3 Trained and Generalized Outputs for the 0.5x^2 Target Function

4.2 EQUALIZATION OF MULTIPATH DISTORTED 64-QAM

Application

Equalization of Multipath Distorted 64-QAM using a Backpropagation Neural Network

Introduction

The use of Neural Networks in the equalization of signals exhibiting intersymbol interference (ISI) was studied using the block diagram below:


Overview

A transmitter generating random 64-QAM symbols at a sampling rate of 20 samples/symbol was used as the system input. A 256-tap raised-cosine rolloff filter with a rolloff factor of 0.5 was used for pulse shaping at the transmitter side. The channel model used was a multipath Rummler model consisting of the primary signal summed with a delayed, rotated, and attenuated version of the primary signal. Additive white Gaussian noise (AWGN) with a variance of 0.005 was added to the multipath signal to complete the channel model. A 256-tap raised-cosine rolloff filter with a rolloff factor of 0.5 was also used for pulse shaping at the receiver side, fulfilling Nyquist's criterion for minimizing ISI. The ISI-distorted signal (after receiver pulse-shaping) is input to a Linear Adaptive Equalizer and a Backpropagation Neural Network for comparison. The channel model causes ISI to occur, necessitating the use of equalization for proper demodulation. In the Backpropagation Neural Network and the linear equalizer, one in-phase sample and one quadrature-phase sample of each of the last 16 received symbols are used as input, hence 32 inputs. The output layer produces an in-phase and quadrature pair which represents the equalized signal at the optimum sampling time for demodulation.

Rummler Channel

The Rummler channel model consists of the primary signal summed with a delayed, rotated, and attenuated version of the primary signal. The input to output relationship for this channel is:

$$y(t) = x(t) - \beta x(t - \tau) e^{j 2 \pi \tau f_0}$$

where x(t) is the Rummler channel input
      y(t) is the Rummler channel output
      $\beta$ is the gain of the reflected signal
      $\tau$ is the time delay between primary and reflected signals
      $f_0$ is the null frequency

Taking a Fourier transform of the Rummler channel impulse response gives a spectrum with severe attenuation for certain frequencies, called nulls (see Figure 4.2-1). The Rummler channel model lets the user specify these frequencies. The null frequency used in this system is 4 MHz. The sampling frequency is 320 MHz. The baud rate is 16 MHz.
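The location of the null can be checked with a short sketch of the channel's frequency response. It assumes the two-ray form y(t) = x(t) - beta x(t - tau) exp(j 2 pi f0 tau), whose transfer function is H(f) = 1 - beta exp(-j 2 pi (f - f0) tau). The reflected-path gain of 0.9 and the 30-sample delay are assumed values for illustration; the 4 MHz null and 320 MHz sampling frequency are taken from the text.

```python
import cmath
import math

def rummler_response(f, beta, tau, f0):
    """Frequency response H(f) = 1 - beta * exp(-j*2*pi*(f - f0)*tau) of the two-ray model."""
    return 1.0 - beta * cmath.exp(-1j * 2.0 * math.pi * (f - f0) * tau)

fs = 320e6            # sampling frequency from the text
f0 = 4e6              # null frequency from the text
tau = 30 / fs         # assumed multipath delay of 30 samples
beta = 0.9            # assumed gain of the reflected signal

for f in (f0 - 2e6, f0, f0 + 2e6, f0 + 1.0 / (2.0 * tau)):
    print(f / 1e6, abs(rummler_response(f, beta, tau, f0)))
```

At f = f0 the two paths cancel most strongly and the magnitude drops to 1 - beta. The response is periodic in frequency with period 1/tau, so longer delays place more nulls inside the signal band.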


Figure  4.2-1  Rummler  Channel  Frequency  Response 


Raised Cosine Rolloff Filters


The raised cosine rolloff filter is a family of lowpass filters which assist in satisfying Nyquist's First Method for eliminating Intersymbol Interference (ISI). Basically, the goal is to have a system with equivalent transfer function

$$H_e(f) = H(f) H_{tx}(f) H_c(f) H_{rx}(f)$$

where $H(f)$ is the frequency spectrum of the modulated signal before filtering
      $H_{tx}(f)$ is the transfer function of the Transmitting filter
      $H_c(f)$ is the transfer function of the Channel
      $H_{rx}(f)$ is the transfer function of the Receiving filter

such that the corresponding system impulse response, $h_e(t)$, is zero at all sampling times other than the time associated with the transmitted symbol at that instant:

$$h_e(k T_s) = \begin{cases} C & \text{for } k = 0 \\ 0 & \text{for } k \ne 0 \end{cases}$$

where $T_s$ is the sampling period
      C is the value of the transmitted symbol

If this is true, the adjacent symbols in a stream will not bleed into each other at the sampling times, which are the only times of interest. Choosing a raised cosine rolloff filter for $H_{tx}(f)$ and $H_{rx}(f)$ with a well-behaved channel will satisfy Nyquist's First Method. However, the channel may not be well-behaved or it may even be varying. An equalizer (or Neural Network) can be used to compensate for the channel behavior in order to result in a total system transfer function satisfying the above criteria.
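The zero-crossing property can be verified directly from the raised-cosine impulse response. The sketch below assumes the overall response is a full raised cosine with rolloff 0.5 and unit symbol period, and uses the standard closed-form expression, including its limiting value at the point where the denominator vanishes. The function name is hypothetical.

```python
import math

def raised_cosine(t, T=1.0, alpha=0.5):
    """Raised-cosine impulse response, normalized so that h(0) = 1."""
    if t == 0.0:
        return 1.0
    x = t / T
    denom = 1.0 - (2.0 * alpha * x) ** 2
    if abs(denom) < 1e-12:
        # Limit at t = +/- T/(2*alpha), where the closed form is 0/0.
        arg = math.pi / (2.0 * alpha)
        return (math.pi / 4.0) * math.sin(arg) / arg
    sinc = math.sin(math.pi * x) / (math.pi * x)
    return sinc * math.cos(math.pi * alpha * x) / denom

for k in range(-3, 4):
    print(k, raised_cosine(float(k)))
print(0.5, raised_cosine(0.5))
```

The response is 1 at k = 0 and numerically zero at every other integer multiple of the symbol period, so neighboring symbols contribute nothing at the sampling instants. Between sampling instants the response is not zero, which is why sample timing matters.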

This Raised Cosine Rolloff Filter block performs frequency-domain filtering. This routine is typically much faster than the time domain implementation which uses a linear convolution method. The generated raised-cosine frequency response can be either a complete raised-cosine or a square-root raised-cosine filter depending on a parameter. Another string parameter is provided if a complete (or square-root) raised-cosine cascaded with an inverse sinc function is required. The actual filtering is implemented in the following steps. The input vector signal is zero padded so that its length is twice as large. That is, the length of the zero-padded input signal is equal to the number of interpolation points (or FIR tap length). The complex FFT of this signal is then taken, and the resulting vector is weighted by the frequency domain description of the filter which was calculated during initialization. Then the inverse FFT is taken to convert the resulting sequence back to the time domain. The output is then delayed and summed to realize the "overlap and add" computation of the output.
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Network  Parameters 


    Paradigm                    Backpropagation
    Input Nodes                 32
    Hidden Nodes                2
    Output Nodes                2
    Learn Rate                  0.3
    Momentum                    0.9
    Update Interval             20*
    Number of Passes            2500
    Training Size               50K
    SPW Iterations per vector   20

*The training set consisted of a continuous stream of random 64-QAM received symbols and the corresponding target symbols. Weights were updated at the end of every 20 input vectors. The I and Q values for the last 16 received symbols constituted the input vector.


Results 

In Figure 4.2-2 Signal A1 represents the constellation for the equalized, unquantized signal at the output of the linear equalizer. Signal A2 shows the same for the Neural Network. Signal A3 displays the rotated, ISI distorted signal constellation appearing at the input to the Neural Network and the equalizer. Both the Neural Network and the linear equalizer were able to compensate for the majority of the distorting effects of the multipath channel.

Both the Neural Network and the linear adaptive equalizer exhibit the following functionality:

1. Both can correct intersymbol interference with “multipath delay” on the order of several symbols (e.g. 30 samples) when given a training signal. We are using a Rummler channel which consists of a primary signal component summed with a scaled, rotated, and delayed secondary component. The amount of this delay in time is what we are specifying when we refer to the “multipath delay”. This delay is specified to be some number of samples. To determine a delay in seconds, you must know the sampling rate, f_s. Using our terminology, one sample of multipath delay equals 1/f_s seconds.

2. Both can correct ISI with multipath delay on the order of a few samples (e.g. 2 samples) from start-up using hard decision feedback instead of a training signal, given a suitable initial weight setting. For the linear equalizer this suitable initial weight setting was to initialize all taps to 0.0 except for the center tap which was set to 1.0. For the Neural Network, the initial weights were set to those resulting from training to a zero delay multipath situation.

3. Both can continue to correct ISI with either a fixed multipath delay or a slowly varying multipath delay using hard decision feedback instead of a training signal, given prior convergence. Here, slowly varying means step increases (or decreases) of 1 sample of multipath separated by sufficient time to allow for re-convergence.

4. Neither will correct ISI for a rapidly changing multipath delay. Here, rapidly changing multipath delay means step increases (or decreases) of 5 or more samples.

Lessons  Learned 

Scaling of signals is very important for the Neural Network. Input and target signals must
be scaled appropriately to remain within the bounds of the sigmoid. Improper scaling may
cause saturation of the sigmoids, which in turn will cause training to virtually cease
(network paralysis) and/or output levels to be clipped at the upper and lower bounds of the
sigmoid.
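
The saturation effect described above can be illustrated with a short sketch. It assumes a logistic sigmoid shifted to the range (-0.5, 0.5), which is consistent with the -0.5 and +0.5 saturation values mentioned elsewhere in this report; the function names and the 0.4 scaling limit are illustrative choices and are not taken from the SPW implementation.

```python
import math

def sigmoid(x):
    """Assumed symmetric sigmoid with outputs in (-0.5, 0.5)."""
    return 1.0 / (1.0 + math.exp(-x)) - 0.5

def sigmoid_slope(x):
    """Derivative of the sigmoid above; backpropagation scales weight updates by it."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def scale_to_range(values, limit=0.4):
    """Linearly scale a list so that its largest magnitude equals `limit`."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)
    return [v * limit / peak for v in values]

# The slope shrinks toward zero as the node input grows, so a saturated node
# passes back almost no gradient and training nearly stops.
for net_input in (0.0, 2.0, 10.0):
    print(net_input, sigmoid(net_input), sigmoid_slope(net_input))
```

Scaling inputs and targets with something like `scale_to_range` keeps the nodes in the region where the slope is still useful.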

Synchronization and timing in SPW are also very important. Careful synchronization
between input signals and corresponding target signals is required. An external clock is
used to control the clocking of samples into the network. Only the center sample of each
symbol is input to the Neural Network. Weights are updated once per symbol.

The Neural Network requires about 50 times as many iterations to converge as the
linear adaptive equalizer.
Figure 4.2-2 Signal Constellations for 64-QAM Equalization

4.3 EQUALIZATION OF DYNAMIC MULTIPATH DISTORTION

Application

Use of a Backpropagation Neural Network to Control a Bank of Equalizers in a Dynamic
Multipath Environment

Introduction 

The simulation of the multipath-distorted 64 QAM system previously discussed showed
that a Backpropagation Neural Network could equalize a rotated constellation, and also
adapt to "slow" changes of the multipath delay in an unsupervised scenario (using its own
output as a target). This was true of the Linear Adaptive Equalizer as well. However, if
the multipath delay were to change suddenly and to a "large" degree, both the Neural
Network equalizer and the linear adaptive equalizer would not be able to keep up and would
require a training signal to once again cancel the effects of multipath. This section is a
summary of the use of N equalizers (neural or linear adaptive), each trained to a different
amount of multipath delay and then having its weights fixed, to create a structure
which will be able to handle a large range of dynamic multipath distortion.

Overview 

The  use  of  Neural  Networks  in  the  equalization  of  signals  exhibiting  intersymbol 
interference  (ISI)  as  a  result  of  a  dynamic  multipath  distortion  was  studied  using  a  structure 
described  by  the  block  diagram  below. 


A transmitter generating random 64-QAM symbols was simulated using a sampling rate of
20 samples/symbol. A 256-tap raised-cosine rolloff filter with a rolloff factor of 0.5 was
used for pulse shaping on the transmitter side. The channel model used was a multipath
Rummler model consisting of the primary signal summed with a delayed, rotated, and
attenuated version of the primary signal. The input to output relationship for this channel
is:


$$y(t) = x(t) - \rho\, x(t - \tau)\, e^{j 2 \pi f_0 \tau}$$

where x(t) is the Rummler channel input
      y(t) is the Rummler channel output
      $\rho$ is the gain of the reflected signal
      $\tau$ is the time delay between primary and reflected signals
      $f_0$ is the null frequency

The time delay ($\tau$) is varied to represent a dynamic multipath channel.
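
As a rough illustration of the channel relationship above, the following sketch applies the two-ray model to a list of complex baseband samples. It assumes an integer delay in samples (so the delay in seconds is the sample count divided by the sampling rate) and treats samples before the start of the list as zero; the function and parameter names are hypothetical and do not come from the SPW block.

```python
import cmath

def two_ray_channel(x, delay_samples, rho, f0, fs):
    """Apply y[n] = x[n] - rho * exp(j*2*pi*f0*tau) * x[n - delay_samples].

    x             : list of complex baseband samples
    delay_samples : integer multipath delay in samples (tau = delay_samples / fs)
    rho           : gain of the reflected signal
    f0            : null frequency in Hz
    fs            : sampling rate in Hz
    """
    tau = delay_samples / fs
    rotation = cmath.exp(1j * 2.0 * cmath.pi * f0 * tau)
    y = []
    for n, sample in enumerate(x):
        # Samples before the start of the record are assumed to be zero.
        delayed = x[n - delay_samples] if n >= delay_samples else 0j
        y.append(sample - rho * rotation * delayed)
    return y
```

Varying `delay_samples` between calls is one simple way to mimic the step changes in multipath delay discussed in this section.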

Additive white Gaussian noise with a variance of 0.005 was added to the multipath signal
to complete the channel model.

A 256-tap raised-cosine rolloff filter with a rolloff factor of 0.5 was also used for pulse
shaping on the receiver side. The channel model causes ISI to occur, necessitating the use
of equalization for proper demodulation.

We are using a Rummler channel, which consists of a primary signal component summed
with a scaled, rotated, and delayed secondary component. In some practical cases, the
delay of the secondary path relative to the primary path is a fixed, constant value. For
example, a Line of Sight microwave link may have a secondary path due to reflection
off a nearby building. If the transmitter and receiver are stationary, the relative delay
between the signal components due to the reflection is constant. In other situations, the
relative delay may be dynamic because something in the geometry of the system is in
motion. In practice, situations like this can also occur when changing atmospheric
conditions alter the path of one (or both) of the two main signal components, thereby
affecting the time delay between them. "Slowly" and "quickly" varying multipath are
qualitative terms we use to gauge how fast the relative delay is changing. Here, "slowly
varying" means step increases (or decreases) of one sample of multipath delay separated by
sufficient time to allow for re-convergence. "Quickly varying" means step increases of
more than one sample of multipath delay. As before, we equate one sample of multipath
delay to a relative delay between the signal components of 1/f_s seconds.

The Backpropagation Neural Network is used to decide which of the fixed equalizers is
best at equalizing the multipath distortion (in effect estimating the value of the dynamically
varying multipath delay). It makes decisions based on a function of the last three complex
symbols output from each equalizer. This yields an input vector of size 18: 2 elements per
complex symbol for the last 3 symbols for each of 3 equalizers. To expedite training, we
use the absolute value of the difference between the complex symbol output of an equalizer
and the nearest 64 QAM symbol level as the inputs to the Neural Network. The output of
the Neural Network is a vector identifying which of the equalizers is correctly cancelling
the effects of multipath at the current time. As the multipath channel varies dynamically,
the Neural Network will make decisions dynamically as to the proper equalizer to use for
signal demodulation. This structure can be expanded to include more equalizers. Such a
structure would be more robust and would be able to compensate for a wider range of
multipath fluctuation, and in a faster manner, compared to conventional methods.
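
The construction of the 18-element input vector can be sketched as follows. This is only an interpretation of the description above: it assumes the "absolute value of the difference" is taken separately on the I and Q rails (which yields the stated 2 elements per complex symbol) and that the 64 QAM levels are the unscaled values -7 through +7 in steps of 2. All names are hypothetical.

```python
LEVELS_64QAM = [-7, -5, -3, -1, 1, 3, 5, 7]  # assumed unscaled per-rail levels

def nearest_level(value, levels=LEVELS_64QAM):
    """Return the constellation level closest to `value` on one rail."""
    return min(levels, key=lambda level: abs(value - level))

def symbol_features(symbol):
    """Return (|I error|, |Q error|) relative to the nearest 64 QAM point."""
    return (abs(symbol.real - nearest_level(symbol.real)),
            abs(symbol.imag - nearest_level(symbol.imag)))

def selector_input(equalizer_outputs, history=3):
    """Build the selector input from the last `history` symbols of each equalizer.

    equalizer_outputs : one list of complex output symbols per equalizer
    With 3 equalizers and history=3 the result has 3 * 3 * 2 = 18 elements.
    """
    vector = []
    for outputs in equalizer_outputs:
        for symbol in outputs[-history:]:
            vector.extend(symbol_features(symbol))
    return vector
```

An equalizer that is matched to the current multipath delay produces outputs close to constellation points, so its six entries in the vector stay small while the others grow.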

Linear  Adaptive  Equalizers 

Several 16-tap linear adaptive equalizers are connected to the output of the raised cosine
filter at the receiver side. Each of these equalizers has been trained (and has its
weights fixed) to a different amount of multipath. We have shown that Backpropagation
Neural Networks are capable of performing linear equalization, and could be used here
instead of the linear adaptive equalizers. We use the linear adaptive equalizers simply for
convenience in this simulation.




The SPW equalizer block implements a minimum-mean-square error linear adaptive
equalizer for QAM. It has an equalizer input, training sequence input, quantized QAM
output, unquantized QAM output, and the tap weights output. Parameters include the
number of taps, first angle, QAM order, and feedback gain. The taps are separated by one
symbol interval in time. The sample at symbol center is used to update the taps, i.e. this is
assumed to be the point at which the eye is most open, and will be the point forced open by
the equalizer. The feedback gain constant should be made smaller as the number of taps
increases. If it is too large for the number of taps, then the equalizer may not converge. A
control signal chooses either the decision feedback or the reference (training) signal for the
feedback loop.
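
The behavior described above resembles a symbol-spaced LMS equalizer, and a minimal sketch of one is given here for orientation. It is not the SPW block: the update rule, the function names, and the toy channel in `demo_training` (which only halves the signal) are assumptions made for illustration.

```python
import random

def make_equalizer(num_taps):
    """Return symbol-spaced taps: all 0.0 except a center tap of 1.0."""
    taps = [0j] * num_taps
    taps[num_taps // 2] = 1 + 0j
    return taps

def lms_step(taps, window, reference, gain):
    """Run one LMS update in place and return the unquantized output.

    window    : the len(taps) most recent symbol-center samples (complex)
    reference : training symbol, or the quantized output for decision feedback
    gain      : feedback gain (step size); too large a value prevents convergence
    """
    output = sum(w * x for w, x in zip(taps, window))
    error = reference - output
    for i, x in enumerate(window):
        taps[i] += gain * error * x.conjugate()
    return output

def demo_training(num_symbols=600, gain=0.1, seed=0):
    """Train a 3-tap equalizer on a hypothetical channel that halves the signal."""
    rng = random.Random(seed)
    symbols = [complex(rng.choice((-1, 1)), rng.choice((-1, 1)))
               for _ in range(num_symbols)]
    received = [0.5 * s for s in symbols]
    taps = make_equalizer(3)
    error = 0.0
    for n in range(2, num_symbols):
        window = [received[n], received[n - 1], received[n - 2]]
        # The center tap sees sample n - 1, so that symbol is the reference.
        output = lms_step(taps, window, symbols[n - 1], gain)
        error = abs(symbols[n - 1] - output)
    return taps, error
```

Passing the quantized output as `reference` instead of a known training symbol gives the decision feedback mode selected by the control signal.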

Generation  of  Training  Data 

Figure 2 depicts the SPW system which provides the input data for the Neural Network.
Initially, each of the linear equalizers is itself trained to a specific multipath delay (0,
2, and 4 samples of the 64 QAM signal). Once convergence is reached, the equalizer taps
are fixed. At this point, a random number generator is used to specify a multipath delay of
either 0, 2, or 4 samples. The resulting ISI-distorted signal is processed by each of the
equalizers. For each received symbol, a vector of length 18 (as previously described) is
written to disk.


In addition, a vector of length 3 (v0, v1, v2) is written to disk representing the target
vector. For a given multipath delay (0, 2, or 4 samples), one of the 3 equalizers produces
superior results, and the target vector identifies that equalizer:

Equalizer Producing Superior Results    Target Vector
1                                       (0.5, -0.5, -0.5)
2                                       (-0.5, 0.5, -0.5)
3                                       (-0.5, -0.5, 0.5)

The process is repeated for each of 64 received symbols such that a training set of 64
input-target pairs is created.

Network Parameters

Paradigm                    Backpropagation
Input Nodes                 18
Hidden Nodes                3
Output Nodes                3
Learn Rate                  0.2
Momentum                    0.0
Update Interval             64
Number of Passes            1719
Training Size               110016
SPW Iterations per vector   1

Results 


A Backpropagation network with 18 input nodes, 3 hidden nodes, and 3 output nodes
(Figure 3) was used to learn the training data set. After 100K iterations, the error, rms
error, and output signals appeared as shown in Figure 4.3-1. Note that the error signal at
all times was less than about 0.05. Any errors greater than 0.5 would lead to less than
100% accuracy on the training data. This means that the Neural Network output could be
thresholded to yield 100% accuracy on the training data.
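
The target encoding and the thresholding step can be sketched as below. The threshold of 0 (midway between the -0.5 and +0.5 target values) and the handling of outputs that do not identify exactly one equalizer are assumptions; the report does not specify them.

```python
def target_vector(best_equalizer, num_equalizers=3):
    """Return +0.5 at the best equalizer (1-based index) and -0.5 elsewhere."""
    return [0.5 if i == best_equalizer - 1 else -0.5
            for i in range(num_equalizers)]

def threshold_outputs(outputs, threshold=0.0):
    """Map each network output to +0.5 or -0.5 around an assumed threshold."""
    return [0.5 if value > threshold else -0.5 for value in outputs]

def selected_equalizer(outputs):
    """Return the 1-based index of the chosen equalizer.

    Returns None when the thresholded vector does not mark exactly one
    equalizer, a case the report does not discuss.
    """
    decided = threshold_outputs(outputs)
    if decided.count(0.5) != 1:
        return None
    return decided.index(0.5) + 1
```

With output errors below about 0.05, every thresholded output matches its target, which is why thresholding recovers the correct equalizer for all training vectors.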

Potential  Extensions 

The structure previously described uses a Neural Network to make decisions as to which
signal stream is most correct. It does this for the multipath cases to which the equalizers
were previously trained. This dynamically adaptable equalizer structure can be extended to
cover: a) a greater range of multipath delay, b) variation in the relative strength of the
primary and secondary signal components, and c) intermediate values of multipath delay
and relative strength which fall between the fixed values for which the bank of equalizers
has been trained.

Lessons  Learned 


The target vector involved in this experiment consisted of values of either -0.5 or +0.5. In
other examples it has been observed that training is expedited (when "binary" target values
are appropriate) when the target values are not the saturation values for the sigmoid. For
example, -0.4 and +0.4 could be used. However, this particular experiment showed no
significant performance advantage using one method over the other.
Figure 4.3-1 Various Signals in Backpropagation Network Equalization

4.4 DEMODULATION OF NON-LINEARLY DISTORTED 16-QAM

Application

Demodulation of Non-Linearly Distorted 16 QAM using Backpropagation

Introduction 

This  application  examined  the  use  of  Neural  Network  demodulation  techniques  to 
counteract  the  effects  of  non-linear  channels. 

Overview 

A Backpropagation Neural Network was used to reconstruct the signal constellation of a 16
QAM signal which has been passed through a Travelling Wave Tube amplifier (TWT) and
an AWGN channel. The TWT causes the 16 QAM signal to be distorted in a non-linear
manner such that the corners of the constellation become rounded.


[Block diagram: 16 QAM Source -> TWT -> AWGN -> Backpropagation Neural Network]


To restore the transmitted signal constellation, we require a non-linear mapping from the
distorted input to the undistorted target. The 64 QAM multipath-distorted signal previously
examined (in 4.2 and 4.3) exhibited only linear distortion. In that case, the target symbol
was a linear combination of previous and present received symbols. In the case of 16
QAM distorted by a TWT, the Neural Network must emulate a non-linear function.

The training set was 16 received vectors, each of size 2, corresponding to a single sample
of each of the 16 QAM symbols. The target vector for each input was the 4 bits (+/- 1)
corresponding to each symbol, scaled by 0.4 to remain inside the limits of the output
sigmoids.

Traveling Wave Tube Amplifier

This simulation models the AM-to-AM and AM-to-PM characteristics for a typical
Traveling Wave Tube (TWT) amplifier. The input and output are complex envelope
representations. The equation coefficients and the operating point (dB) are the parameters
of the TWT model.

The TWT amplifier is implemented using the following equations for AM/AM and AM/PM
conversions [2]:

$$A(r) = \frac{\alpha_r r}{1 + \beta_r r^2}, \qquad \Phi(r) = \frac{\alpha_\phi r^2}{1 + \beta_\phi r^2}$$

The coefficients $\alpha_r$, $\beta_r$, $\alpha_\phi$, and $\beta_\phi$ are specified as parameters.
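
A minimal sketch of how these two conversions might be applied to one complex envelope sample follows. It assumes that r is the magnitude of the input sample, that the AM/PM term is a phase shift in radians added to the input phase, and it leaves out the operating point (dB) parameter mentioned above. The function name and argument order are hypothetical.

```python
import cmath

def twt_amplifier(x, alpha_r, beta_r, alpha_phi, beta_phi):
    """Apply AM/AM and AM/PM conversion to one complex envelope sample.

    The four coefficients correspond to those in the equations above.
    """
    r = abs(x)
    if r == 0.0:
        return 0j
    amplitude = alpha_r * r / (1.0 + beta_r * r * r)            # AM/AM: A(r)
    phase_shift = alpha_phi * r * r / (1.0 + beta_phi * r * r)  # AM/PM: Phi(r)
    return amplitude * cmath.exp(1j * (cmath.phase(x) + phase_shift))
```

Because A(r) grows more slowly than r at large amplitudes, the outer constellation points are compressed more than the inner ones, which matches the rounding of the constellation corners described in the Overview.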
Network Parameters

Paradigm                    Backpropagation
Input Nodes                 2
Hidden Nodes                16
Output Nodes                4
Learn Rate                  0.5, 0.25 *
Momentum                    0.5
Update Interval             16
Number of Passes            3125, 3125 *
Training Size               50K, 50K *
SPW Iterations per vector   1

* This network was trained in two phases. The first number gives the value of the
parameter in the 1st phase; the second gives the value in the 2nd phase.


Results 

We found that the Backpropagation Neural Network was capable of mapping distorted 16
QAM into the corresponding 4-bit vector under varying amounts of noise. Figure 4.4-1
compares the BER for the Neural Network and a "Slicer". The Slicer is simply a decision
device whose decision thresholds are fixed to those of ideal 16 QAM. The decision regions
formed by the Slicer are shown in Figure 4.4-2.

Lessons  Learned 


We achieved faster convergence to a solution by modifying the error signal utilized by the
Backpropagation Network. Typically, the error signal for a neural network is chosen to be:

e = target - network output

We found that using an error signal equal to

e = target - Quantized network output

provided much faster convergence. We quantized the network output in the following
manner:

Quantized network output =  0.4, if the network output > 0
                         = -0.4, if the network output < 0

Note that since each component of the target symbol is either 0.4 or -0.4, this causes the
error signal to be 0.8, -0.8, or 0.0. This choice of error signal avoids changing any
weights in the neural network when it is producing the correct quantized output.
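
The modified error signal can be written out as a short sketch. The report does not say how a network output of exactly 0 is quantized; mapping it to -0.4 here is an assumption, as are the function names.

```python
def quantize(output, level=0.4):
    """Hard-limit one network output to +level or -level.

    An output of exactly 0 is mapped to -level (an assumption).
    """
    return level if output > 0 else -level

def quantized_error(target, output, level=0.4):
    """Error signal computed against the quantized output instead of the raw one."""
    return target - quantize(output, level)

def error_vector(targets, outputs, level=0.4):
    """Apply the quantized error to each of the 4 output nodes."""
    return [quantized_error(t, o, level) for t, o in zip(targets, outputs)]
```

For example, a target of 0.4 with a raw output of 0.1 gives zero error, so the weights feeding that node are left alone even though the raw output is still far from 0.4.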
Potential Extensions

Superior  performance  could  be  achieved  by  pre-distorting  a  16  QAM  signal  prior  to 
transmission  in  a  way  that  causes  TWT  amplification  to  result  in  the  ideal  shape  instead  of 
distorted  QAM. 
Figure 4.4-1 Bit Error Rate Performance

4.5 DEMODULATION OF QPSK WITH BACKPROPAGATION

Application

Demodulation of QPSK over a Non-Linear, Dispersive Channel using a Backpropagation
Neural Network

Introduction

This application demonstrated a major area of advantage for Neural Networks over
conventional techniques. The use of Backpropagation neural networks to adapt to
non-linear functions gives considerable improvement over Linear Equalizers.

Overview 

QPSK was transmitted over a dispersive, discrete channel using the block diagram below:


The dispersive, discrete channel was modeled by the transfer function:

$$H(z) = 0.3482 + 0.8701 z^{-1} + 0.3482 z^{-2}$$

where $z^{-1}$ represents a delay of 1 symbol.

A  channel  of  this  form  was  chosen  as  in  [1]  and  [2]  to  emulate  a  situation  where  a 
transmitted  symbol  interferes  with  the  previous  and  next  symbol  due  to  dispersive  effects. 

Furthermore, the channel imparts a non-linearity of:

$$y = 0.5 x^3$$

where x is the output of the dispersive portion of the channel model and
y is the output of the non-linearity.

Finally,  AWGN  was  added. 
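
The three stages of this channel model can be chained together as in the sketch below. It works on a real-valued sequence, under the assumption that the same processing is applied separately to the I and Q rails; the report does not state how the cubic term is applied to complex samples. The function names and the noise standard deviation argument are illustrative.

```python
import random

CHANNEL_TAPS = [0.3482, 0.8701, 0.3482]  # coefficients of H(z) from the text

def dispersive_channel(symbols, taps=CHANNEL_TAPS):
    """FIR filter at one sample per symbol; inputs before the start are zero."""
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * symbols[n - k]
        out.append(acc)
    return out

def cubic_nonlinearity(x):
    """Apply y = 0.5 * x**3 to every sample."""
    return [0.5 * v ** 3 for v in x]

def add_awgn(samples, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    return [v + rng.gauss(0.0, sigma) for v in samples]

def channel(symbols, sigma, seed=0):
    """Dispersion, then the non-linearity, then additive noise."""
    rng = random.Random(seed)
    return add_awgn(cubic_nonlinearity(dispersive_channel(symbols)), sigma, rng)
```

Each received sample depends on three consecutive transmitted symbols, which is consistent with the Neural Network being given the last 3 received symbols as input.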

The I and Q values of the last 3 received symbols were input to the Neural Network,
requiring 6 input nodes.

Network Parameters


Results 

In Figure 4.5-1, Plot A1 shows the signal constellation prior to the non-linearity. Plot A2
shows the constellation at the Equalizer and Neural Network input. A performance
comparison was made between the Linear Adaptive Equalizer and the Backpropagation
Neural Network in terms of Bit Error Rate (BER), showing a distinct improvement in favor
of the Neural Network (Figure 4.5-2). The Linear Equalizer was unable to compensate for
the channel distortion regardless of noise power level, while the Neural Equalizer was able
to significantly reduce the channel effects and achieve a much lower BER.

The  BER  curves  show  the  results  of  several  methods  of  applying  a  Neural  Network  to  this 
problem: 

1. Train the Neural Network to an intermediate or expected value of noise power, then fix
the weights.

2. Train the Neural Network on a noiseless case, then fix the weights.

3. Continuously train the Neural Network as the noise power varies.

The  BER  curves  corresponding  to  the  first  and  second  methods  described  above  are  shown 
in  the  diagram  below.  A  BER  curve  for  the  third  method  would  show  it  to  be  superior  to 
both  of  the  other  methods  for  all  ranges  of  noise  power. 

Potential  Extensions 

Providing for Decision Feedback in the Backpropagation Network should improve
performance. Comparisons can then be made to Decision Feedback Equalizers.

Figure 4.5-1 16-QAM Signal Constellations in Various Points in the Channel Model

4.6 DEMODULATION OF QPSK WITH KOHONEN-OUTSTAR

Application

Demodulation of QPSK using a Configuration of Kohonen and Outstar Neural Networks

Introduction 

The Kohonen Self-Organizing Feature Map is useful in categorizing distributed input
vectors into exemplar vectors. Given a received signal constellation, the Kohonen
Network was used to adapt its weights without the use of a training signal such that the
resulting weights represented the ideal transmitted symbols minus the effects of AWGN.

Overview 

Initial examination of Kohonen Topological Map applications to signal processing was
done using QPSK transmission over an AWGN Channel. QPSK symbols were
represented with complex envelope representation at 1 in-phase and quadrature sample per
symbol. QPSK plus noise was input to a Kohonen Neural Network with 16 nodes. The
Kohonen output was sent to an Outstar which effectively mapped each winning Kohonen
node to a specific QPSK symbol. The trained configuration resulted in a network which
could make decisions on received QPSK symbols based upon their Euclidean distance from
the ideal QPSK symbol positions.

The Kohonen weights will converge to or near the centers of each of the four QPSK
clusters in the received signal constellation. The Outstar network simply learns a mapping
between each Kohonen weight and an ideal QPSK symbol. The resulting trained
Kohonen/Outstar configuration performs quantization of received QPSK symbols in a
manner equivalent to a QPSK slicer.


That is, any received symbol is mapped to the corresponding ideal QPSK symbol based on
the quadrant of the signal constellation in which it lies. For communication constellations
with only AWGN distortion, optimum decision regions are determined by linear dissections of
the signal constellation. Thus this Neural Network configuration provides no advantage
over conventional techniques in terms of performance. However, when there is non-linear
distortion present and/or dynamism in the channel, Kohonen/Outstar configurations can
give BER advantages due to their ability to form optimum decision regions, under certain
conditions. These situations are examined in the next sections.
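
The winner selection and weight adaptation performed by the Kohonen layer can be sketched as follows. This is a simplified winner-take-all version in which only the closest weight is moved; neighborhood updates, which a conventional self-organizing map also applies, are omitted. The names and the (I, Q) tuple representation are assumptions.

```python
def nearest_index(weights, sample):
    """Index of the weight vector closest to `sample` in Euclidean distance."""
    return min(
        range(len(weights)),
        key=lambda i: (weights[i][0] - sample[0]) ** 2
                      + (weights[i][1] - sample[1]) ** 2,
    )

def kohonen_update(weights, sample, rate):
    """Move only the winning weight toward the sample; return the winner index.

    weights : list of (I, Q) tuples, modified in place
    sample  : received (I, Q) pair
    rate    : learning rate between 0 and 1
    """
    win = nearest_index(weights, sample)
    w = weights[win]
    weights[win] = (w[0] + rate * (sample[0] - w[0]),
                    w[1] + rate * (sample[1] - w[1]))
    return win
```

Repeated over many noisy symbols with a small learning rate, each winning weight drifts toward the mean of the samples it wins, which is why the weights settle near the cluster centers without any training signal.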

Results 

In Figure 4.6-1, Plot A1 shows the noisy received signal constellation. Plot A2 shows the
output constellation over the last 10000 symbols during training. Note that the resulting
exemplars have settled to the centroids of the 4 QPSK symbol clusters. Symbol decisions
for received I-Q pairs may be made based on their distance from each of these exemplars.

Figure 4.6-1 QPSK Input and Output Constellations

4.7 DEMODULATION OF NON-LINEARLY DISTORTED 16-QAM

Application

Demodulation of Non-Linearly Distorted 16 QAM using a Configuration of Kohonen and
Outstar Neural Networks

Introduction 

When there is non-linear distortion present and/or dynamic variation in the channel,
Kohonen/Outstar Neural Network configurations can give improved bit error rate (BER)
performance due to their ability to adaptively improve decision regions.

Overview 

A Kohonen/Outstar configuration can be used to adaptively maintain near-optimum
decision regions. A Kohonen network with 16 weights (1 weight for each 16
QAM symbol) will converge to a situation where each weight lies at or near the center of
the corresponding distorted QAM symbol cluster, when AWGN is added to the TWT
output. Instead of making symbol decisions based upon a linear dissection of the
constellation, the Kohonen network will map a received symbol plus noise to the closest
(by Euclidean distance) Kohonen weight. The Outstar maps each Kohonen weight to an
ideal 16 QAM symbol after a brief training period given a training signal. In this manner,
BER performance is improved with the neural configuration.
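
The mapping from a winning Kohonen weight to an ideal symbol can be sketched with a simple outstar-style update, in which the output weights attached to the winning node are pulled toward the training symbol. This is a generic form of the rule and not the SPW implementation; momentum is omitted, and the names and the default rate of 0.05 are illustrative.

```python
def outstar_train(outstar, winner, target, rate=0.05):
    """Move the winner's outstar weights toward the training symbol (I, Q)."""
    i, q = outstar[winner]
    outstar[winner] = (i + rate * (target[0] - i), q + rate * (target[1] - q))

def demodulate(kohonen_weights, outstar, sample):
    """Pick the closest Kohonen weight, then emit the symbol its outstar stores."""
    winner = min(
        range(len(kohonen_weights)),
        key=lambda k: (kohonen_weights[k][0] - sample[0]) ** 2
                      + (kohonen_weights[k][1] - sample[1]) ** 2,
    )
    return outstar[winner]
```

Because the decision is made against the learned (distorted) cluster centers rather than the ideal grid, the decision regions follow the compression introduced by the TWT.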


The Travelling Wave Tube (TWT) Amplifier produces a compression of the 16 QAM signal
constellation due to AM-AM and AM-PM conversion. The non-linear effects worsen as the
output power increases. For this reason, output power levels beyond a certain point are not
feasible using conventional techniques.

The constellation diagram in Figure 4.7-1, Plots A3 and A5, displays the resulting signal for
16 QAM through a TWT, through a bandlimited channel, plus AWGN.

The Kohonen network will not converge to the centers of the transmitted symbols.
However, if the effects of ISI were eliminated, the resulting constellation would be as in
Figure 4.7-1, Plots A4 and A6, and the Kohonen network could converge to cluster
centers. The removal of ISI is easily performed by linear equalization. A Backpropagation
Neural Network has also been shown to be capable of linear equalization.

The  following  structure  is  capable  of  demodulating  a  received  signal  as  in  Figure  4.7-1, 
Plots  A4  and  A6: 


In this structure, an N-tap Linear Equalizer is used to remove the ISI. This may be done
conventionally as shown, or with a Backpropagation Network. In the diagram above, the
output of the equalizer is quantized to the nearest symbol. The difference between the
quantized and unquantized output is an error signal which is scaled and fed back to adjust
the weights of the equalizer in an adaptive manner. In the structure above, symbol
decisions are made based on the nearest Kohonen weight vector. Given that the ISI is
removed by the equalizer, the Kohonen Network can be used to find the centers of the
resulting symbol clusters, which represent the optimum or near-optimum quantal levels for
making symbol decisions. An Outstar Network maps each Kohonen weight vector to an
ideal symbol value. Thus, the Kohonen/Outstar configuration in tandem with the Linear
Equalizer (or Backpropagation Network) yields better performance than either alone.

Figure 4.7-1 Unequalized and Equalized 16-QAM Signal Constellations for Varying
Degrees of Channel Noise
Network Parameters

Paradigm                    Kohonen   Outstar
Input Nodes                 2         16
Hidden Nodes                N/A       N/A
Output Nodes                16        2
Learn Rate                  *         0.05
Momentum                    N/A       0.05
Update Interval             1         1
Number of Passes            *         25K
Training Size               *         25K
SPW Iterations per vector   16        16

* The Kohonen network was trained in two phases. In Phase 1, training proceeded
conventionally with a learning rate of 0.7. Phase 1 training was in effect for 16K
iterations. At the completion of Phase 1 training, Phase 2 training began. Phase 2
training was in adaptive mode, where only the weights to the winning Kohonen node
were updated. The learning rate during Phase 2 training was 0.1.

Results

Figure  4.7-2  compares  the  BER  of  a  Linear  Equalizer  alone  against  that  of  the 
Kohonen/Outstar/Equalizer  hybrid. 
Figure 4.7-2 Bit Error Rate Performance

4.8 DEMODULATION OF 16-QAM OVER A RAYLEIGH CHANNEL

Application

Demodulation of 16 QAM over a Rayleigh Channel using a Configuration of Kohonen and
Outstar Neural Networks

Introduction 

We investigated the use of a combined Linear Equalizer and Kohonen network to "learn
and track" the dynamic movement of a 16 QAM constellation when subjected to a Rayleigh
Fading Channel.

Overview 

A more extensive application of the Kohonen network has been formulated. Mobile
communication experiences dynamic channel distortion, which has been modeled by
Rayleigh Fading Channels. In such channels, the received signal constellation is both
rotated and attenuated as functions of time. Typically, PSK-based communication
schemes are used in mobile communications since all transmitted symbols are of the same
magnitude. This allows for Automatic Gain Control to compensate for the variable
attenuation induced by the channel. However, PSK-based communication is not as
spectrally efficient as QAM-based communication. Use of Automatic Gain Control for
QAM is more complex because QAM symbols are of different magnitudes. The
application of the Equalizer/Kohonen hybrid to the Rayleigh channel was examined using
the block diagram below:


Rayleigh  Fading  Channel 

The Rayleigh channel in the above block diagram represents the flat (or single ray)
Rayleigh fading channel model used to model the effects of multiple point scatterers in the
neighborhood of a moving receiver in mobile communications.

The output of the block is simply the input times a single complex time-varying weight.
The weight is called the Rayleigh channel weight and is generated by passing complex
white Gaussian noise through a fading filter and then interpolating the output of the fading
filter. The fading filter, also referred to as the spectrum shaping filter, is based on Jake's
model [Jake, 1974] and has a frequency response of:
$$H(f) = \begin{cases} \dfrac{1}{\left[1 - (f/f_d)^2\right]^{0.5}}, & |f| \le f_d \\ 0, & |f| > f_d \end{cases}$$

where $f_d$ is the Doppler frequency.

Since the Doppler frequency is usually much less than the sampling frequency, the fading
filter response H(f) is usually that of a very narrow lowpass filter.
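
The frequency response of the fading filter can be evaluated with the short sketch below. The expression is unbounded exactly at the band edge, so the sketch returns infinity there; a practical filter design would have to limit or sample around that point, which the report does not describe. The function names are hypothetical.

```python
import math

def fading_filter_response(f, fd):
    """Evaluate H(f) = 1 / sqrt(1 - (f/fd)**2) inside the Doppler band, 0 outside.

    f  : frequency in Hz
    fd : Doppler frequency in Hz (must be positive)
    """
    ratio = abs(f) / fd
    if ratio > 1.0:
        return 0.0
    if ratio == 1.0:
        return math.inf  # the expression is unbounded at the band edge
    return 1.0 / math.sqrt(1.0 - ratio * ratio)

def apply_flat_fading(samples, weights):
    """Multiply each input sample by the matching complex channel weight."""
    return [x * w for x, w in zip(samples, weights)]
```

The response rises from 1 at f = 0 toward the band edge and is zero beyond it, so with a Doppler frequency far below the sampling rate the channel weight changes slowly from sample to sample.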

Kohonen  Equalizer 

In  an  earlier  experiment,  we  found  that  a  Kohonen/Outstar  configuration  could  be  used  to 
form  near-optimum  decision  regions  for  a  transmitted  signal  constellation  that  has  been 
corrupted  by  non-linear  effects,  ISI,  and  noise. 


The final extension to the above configuration is useful for signaling which has been
distorted by non-linearity, ISI, and noise in a dynamic sense. Consider using the
winning Kohonen weight vector as the target signal for the computation of the feedback
error signal. If the received constellation of Figure 4.7-1 Plot A4 is also rotating,
compressing, and expanding, as in a Rayleigh channel, the conventional equalizer will not
be able to keep up under certain conditions. A modification to the Kohonen learning
algorithm allows the weight vectors to track the movement of the constellation, thus
relaxing the rate at which the linear equalizer must adapt. The use of the winning Kohonen
weight vector for the computation of the error signal further assists the adaptation. It is
also hypothesized that a conventional equalizer will be unable to eliminate ISI if the
transmitted constellation is non-linearly distorted beyond a certain point unless given a
more accurate desired (or target) signal from which to derive the error signal. The
configuration below can accommodate this more accurate target signal via feedback from
the winning Kohonen weight vector.

Network Parameters

Paradigm                  Kohonen   Outstar
Input Nodes               2         16
Hidden Nodes              N/A       N/A
Output Nodes              16        2
Learn Rate                0.7       0.05
Momentum                  N/A       0.05
Update Interval           1         1
Number of Passes          *         12K
Training Size             *         12K
SPW Iteration per vector  1         1

* The Kohonen network was trained in two phases. In Phase 1, training proceeded
conventionally with a learning rate of 0.7. Phase 1 training was in effect for 16K
iterations. At the completion of Phase 1 training, Phase 2 training began. Phase 2
training was in adaptive mode, where only the weights to the winning Kohonen node
were updated. The learning rate during Phase 2 training was 1.0.
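
The adaptive (Phase 2) mode, in which only the winning node is updated and its weight vector
serves as the equalizer target, can be sketched as follows. This is an illustration under
stated assumptions rather than the program's implementation: complex numbers stand in for the
two-element (I, Q) input vector, a 4-point QPSK constellation replaces the 16-node network to
keep the example short, and `track_and_target` and `demo` are hypothetical names.

```python
import cmath
import math


def nearest_node(weights, sample):
    """Index of the Kohonen weight vector closest to the received sample."""
    return min(range(len(weights)), key=lambda i: abs(weights[i] - sample))


def track_and_target(weights, sample, learn_rate):
    """Update only the winning node; return (winner, target, error)."""
    winner = nearest_node(weights, sample)
    weights[winner] += learn_rate * (sample - weights[winner])
    target = weights[winner]          # fed back as the equalizer's desired signal
    return winner, target, target - sample


def demo(total_rotation=math.pi / 3, n_symbols=200, learn_rate=0.5):
    """Slowly rotate a QPSK constellation and count correctly tracked symbols."""
    ideal = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
    weights = list(ideal)
    correct = 0
    for n in range(n_symbols):
        k = n % 4
        received = ideal[k] * cmath.exp(1j * total_rotation * n / n_symbols)
        winner, _, _ = track_and_target(weights, received, learn_rate)
        correct += winner == k
    return correct, weights
```

In this toy run the constellation rotates by 60 degrees in total, which is more than the
45 degrees a fixed QPSK decision boundary tolerates, yet the tracked weight vectors keep
following their own constellation point because each update is small relative to the spacing
between points.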


Results 

Results of this experiment are best displayed via the corresponding interactive SPW
demonstration. Refer to the Appendix for the demonstration procedures.

4.9 IMPROVING SOFT DECISIONS IN A JAMMING ENVIRONMENT

Application 

Improving Viterbi Decoder Soft Decisions in a Pulse Jamming Environment

Introduction 

Convolutional coding of data has been shown to improve performance of communications
systems in the presence of additive white Gaussian noise (AWGN). Decoding of
convolutionally encoded data in an AWGN scenario is optimally done via Viterbi decoding.
Supplying soft decisions to a Viterbi decoder instead of hard decisions can give a
performance increase of approximately 2 dB. We propose an adaptive, non-linear method
of supplying the soft decisions which yields a performance gain in a pulse jamming
environment. This technique is similar to that applied in several papers [Asato, Grover &
Cahn, “Artificial Neural Network Adaptive Non-linear Digital Receivers”], [Anderson,
“Generation of Soft Information in a Frequency Hopping HF Radio System Using Neural
Networks,” Milcom '92 Proceedings Vol. 2]. This adaptive receiver structure can adjust to
varying jammer conditions. It requires no training signal and no knowledge of the pulse
jammer duty cycle, channel error rate, or jammer magnitude. If the jammer were to be
permanently turned off or its duty cycle were to change, the network would change its soft
decision metric function appropriately.

Overview 

This neural network application was studied using the block diagrams below:


Transmitter  and  Channel  Model 




Receiver  Model 


Random binary data, with 0 and 1 equally likely, was encoded by a rate 1/2,
constraint length 5 convolutional encoder. This encoded stream modulated a binary
phase-shift keying (BPSK) transmitter, whose outputs were +1 or -1. The channel consisted of
background AWGN and a pulse jammer as described below. The received signal was
demodulated and input to a Backpropagation Neural Network which adaptively generates a
soft decision metric for input to the Viterbi decoder. The Viterbi decoder utilizes the soft
decisions quantized to 8 levels to produce the estimate of the transmitted data stream.
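
The transmitter chain can be sketched as below. The report does not state the generator
polynomials, so the octal pair (23, 35), a commonly used choice for rate 1/2, constraint
length 5 codes, is an assumption here, as are the function names and the bit-to-symbol
mapping (1 to +1, 0 to -1).

```python
def conv_encode(bits, generators=(0o23, 0o35), constraint_length=5):
    """Rate 1/2 convolutional encoder; the newest bit enters at the LSB of the register."""
    mask = (1 << constraint_length) - 1
    state = 0
    coded = []
    for b in bits:
        state = ((state << 1) | b) & mask
        for g in generators:
            # Each output bit is the parity of the register taps selected by g.
            coded.append(bin(state & g).count("1") % 2)
    return coded


def bpsk_map(coded_bits):
    """Map coded bits to BPSK symbols (assumed mapping: 1 -> +1, 0 -> -1)."""
    return [1.0 if b else -1.0 for b in coded_bits]
```

Each input bit produces two coded bits, so a block of N data bits yields 2N BPSK symbols.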

The weights of a Backpropagation Neural Network are adjusted via an error signal which is
calculated based upon a comparison between the network output and a target signal. In
some applications, a training signal is available. In situations where a training signal is not
available, the target signal must be estimated from information at hand. In this application,
a fairly good estimate of the transmitted signal is the Viterbi decoder output. The neural
network will produce a soft decision metric which will be used by the decoder to produce
an estimate of the message data. The job of the neural network is to learn to use the
demodulator output to give the decoder an accurate metric. For this reason, we re-encode
the decoder output for use in calculating the error signal. The re-encoded decoder output is
a good estimate of the actual signal for comparison with the received noisy signal.


The Viterbi decoder and encoder contain internal delays which must be accounted for when
using their output in calculation of error signals. For this reason, two Backpropagation
Neural Networks are used. The error signal is applied to the bottom neural network. The
input to the bottom neural network is a delayed version of the demodulator output, in order
to compensate for internal decoder and encoder delays. The weights of the bottom neural
network are transferred to the top neural network immediately upon change for use in
calculating the soft decision metrics.
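
The two-network arrangement can be sketched as a delay line feeding a trained copy whose
weights are mirrored to a forward copy. To stay short, this sketch replaces each
Backpropagation network with a single linear weight; that substitution, the class name
`DelayedTrainer`, and the LMS-style update are assumptions for illustration only.

```python
from collections import deque


class DelayedTrainer:
    """Train a delayed copy of a model and mirror its weights to a forward copy."""

    def __init__(self, delay, weight=1.0, learn_rate=0.1):
        self.delay = delay              # decoder plus re-encoder latency, in symbols
        self.pending = deque()          # demodulator outputs awaiting their target
        self.train_weight = weight      # "bottom" network (receives the error signal)
        self.forward_weight = weight    # "top" network (produces soft decisions)
        self.learn_rate = learn_rate

    def forward(self, demod_out):
        """Produce a soft decision now and remember the input for later training."""
        self.pending.append(demod_out)
        return self.forward_weight * demod_out

    def train(self, target):
        """Apply a target that belongs to the input seen `delay` symbols ago."""
        if len(self.pending) <= self.delay:
            return                      # no target is available yet
        x = self.pending.popleft()
        error = target - self.train_weight * x
        self.train_weight += self.learn_rate * error * x
        self.forward_weight = self.train_weight   # transfer immediately upon change
```

Keeping the undelayed forward path separate means the soft decisions are never held up by
the decoder latency, while training still pairs each target with the input that produced it.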


Ideally, the soft decision metric given to the Viterbi decoder for an arbitrary demodulated
signal level is the log-likelihood function. It is the logarithm of the likelihood ratio, which is
the quotient of the probability of bit correctness divided by the probability of bit error,
given the demodulated signal:

$$
\text{Metric} = \log \frac{\Pr(\text{correct} \mid \text{demodulated value})}{\Pr(\text{incorrect} \mid \text{demodulated value})}
$$


The Viterbi decoding algorithm for convolutional codes is equivalent to maximum
likelihood decoding and thus is optimum for equally likely messages.

Most conventional soft decision implementations are designed for stationary AWGN
degradation. In many situations, channel noise is not stationary, such as in pulse jamming
environments. In AWGN environments, the ideal soft decision function is simply a linear
function of the demodulator output. For BPSK signalling with transmitted symbols +1 and
-1, demodulator outputs near 0 would be assigned a small metric since it is not certain
whether a +1 or -1 was actually transmitted. Large positive (or negative) demodulator
outputs would be assigned a large positive (or negative) metric, since it is nearly certain that
a +1 (or -1) was transmitted: AWGN is unlikely to account for such a change in
received signal voltage.


Conventional Soft Decision Metric


Pulse  Jammer 

The pulse jammer is modeled as an AWGN noise source that is switched on and off at a
particular duty cycle. When the jammer is off, the noise caused by the channel is simply
that of the background noise, $N_0/2$. When the jammer is on, the noise power is
$N_j/2 \gg N_0/2$. This jammer will cause bits transmitted during the "on" times to vary
greatly in magnitude. Thus it is much less certain whether a +1 or -1 was transmitted. For
this reason, such bits should be assigned a low metric.
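
A minimal sketch of this jammer model follows. The report specifies only that the source is
switched at a particular duty cycle, so the strictly periodic on/off pattern, the parameter
names, and the returned jammer mask are assumptions made for illustration.

```python
import random


def pulse_jammed_channel(symbols, sigma_bg, sigma_jam, duty, period, rng):
    """Add background noise when the jammer is off and jammer noise when it is on.

    sigma_bg and sigma_jam are noise standard deviations, i.e. sqrt(N0/2) and sqrt(Nj/2).
    """
    on_len = round(duty * period)            # symbols jammed in each period (assumed periodic)
    received, jammed = [], []
    for i, s in enumerate(symbols):
        jam_on = (i % period) < on_len
        sigma = sigma_jam if jam_on else sigma_bg
        received.append(s + rng.gauss(0.0, sigma))
        jammed.append(jam_on)
    return received, jammed
```

For a jammer 20 dB above the background noise, `sigma_jam` would be ten times `sigma_bg`,
since a 20 dB power ratio corresponds to a factor of 10 in standard deviation.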

Error  Signal 

The error signal, E, is the difference between the Neural Network Target and the Neural
Network Output (OUT):

E = Target - OUT

The Target, and hence E, is dependent upon a comparison between the quantized
demodulated signal and the encoder output.

Let D denote the output of the BPSK demodulator, and define the "sign" of D to be:

$$
S(D) =
\begin{cases}
+1, & D \ge 0 \\
-1, & D < 0
\end{cases}
$$

Also define the demodulator's "hard" bit decision to be:

$$
H(D) =
\begin{cases}
1, & D \ge 0 \\
0, & D < 0
\end{cases}
$$

Then the Target is defined to be:

$$
\text{Target} =
\begin{cases}
S(D)\,|\text{OUT}|\,(1 + 1/R), & H(D) = \text{encoder output} \\
0, & H(D) \ne \text{encoder output}
\end{cases}
$$

In this equation, R denotes the estimated likelihood ratio.

Since the Neural Network Output, OUT, is trained to the log-likelihood function, the
estimated likelihood ratio, R, is defined as:

$$
R = 10^{|\text{OUT}|}
$$


Thus when the neural network has supplied a metric which has resulted in an incorrect bit
decision, the metric is driven towards 0. When a correct bit decision is made, the metric is
reinforced by a factor (1 + 1/R) which ideally will balance the metric at a value
according to its likelihood ratio. For example, if a given demodulated value D has a
likelihood ratio of $R_0$, then a fraction $1/(1+R_0)$ of the times that D is input to the
network (roughly $1/R_0$ when $R_0$ is large) will result in an incorrect bit decision, giving
a Target of 0. To offset this and allow the network to stabilize at the true value of
$\log(R_0)$, we must supply a target of (1 + 1/R) times the network output when a correct
bit comparison is made.
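
The target rule and its balance point can be checked numerically. In the sketch below the
function names are illustrative, and `expected_target_magnitude` assumes that a demodulated
value with true likelihood ratio `r0` yields a correct bit comparison with probability
`r0 / (1 + r0)`.

```python
import math


def sign(d):
    """S(D): +1 if D is positive or zero, -1 otherwise."""
    return 1.0 if d >= 0 else -1.0


def hard_bit(d):
    """H(D): the demodulator's hard bit decision."""
    return 1 if d >= 0 else 0


def target(d, out, encoder_bit):
    """Training target for network output `out` given demodulator output `d`."""
    r = 10.0 ** abs(out)                      # estimated likelihood ratio R
    if hard_bit(d) == encoder_bit:
        return sign(d) * abs(out) * (1.0 + 1.0 / r)
    return 0.0


def expected_target_magnitude(out, r0):
    """Average |Target| when the true likelihood ratio is r0 (see assumption above)."""
    p_correct = r0 / (1.0 + r0)
    return p_correct * abs(out) * (1.0 + 10.0 ** -abs(out))
```

The average target equals the current output exactly when `10 ** abs(out) == r0`; it exceeds
the output when the metric is too small and falls below it when the metric is too large, which
is the balancing behavior described in the text.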


Simulation Parameters

Paradigm                  Backpropagation
Input Nodes               1
Hidden Nodes              10
Output Nodes              1
Learn Rate                *
Momentum                  0.0
Update Interval           1
Number of Passes          Always in training
Training Size             N/A
SPW Iteration per vector  1

* Linearly decreasing from 9.0 at the rate of -43t/40K (where t is the SPW iteration
count) over the first 25K iterations. Learning rate equals 0.1 afterwards.

The neural network was initially trained to an identity function to approximate the
conventional soft decision metric. These initial conditions are necessary for the network to
give reasonable metrics to the Viterbi decoder at startup for convergence to occur. Training
consisted of varying the learning rate from 9.0 to 4.5 linearly over the first 25K iterations,
then applying a constant 0.1 learning rate over the next 40K iterations. The initial large
learning rates accelerated the learning process to a point where finer adjustment with a
learning rate of 0.1 could begin.
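
The learning-rate schedule described in this paragraph can be written as a small function.
The function and parameter names are illustrative; the values 9.0, 4.5, 25K, and 0.1 are
those quoted in the paragraph.

```python
def learning_rate(t, start=9.0, end=4.5, ramp_iters=25_000, final=0.1):
    """Learning rate at SPW iteration t: a linear ramp, then a small constant."""
    if t < ramp_iters:
        return start + (end - start) * t / ramp_iters
    return final
```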

Theoretical  Background 

The benefit of the neural net approach is to enable the use of near-optimal log-likelihood
ratios without requiring any knowledge of jammer power or duty factor. The network
automatically learns these during operation, and it tracks any changes in these characteristics
as the jamming environment changes. To see how important this might be, we can calculate
the ideal log-likelihood ratio that would be used as a metric and examine the changes in the
ratio as the jamming environment changes. The demodulator output probability density
function as a function of the output voltage $\nu$ is the Gaussian pdf, $N(\nu, \mu, \sigma)$,
where

$$
N(\nu, \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(\nu - \mu)^2}{2\sigma^2}\right)
$$

and where the demodulator output voltage in the absence of noise is assumed to be $\mu$ and
the standard deviation of the noise process is $\sigma$. The correctly received signal is
assumed to have $\mu = \sqrt{E_s}$, where $E_s$ is the received energy per channel symbol.
The noise processes are assumed to be such that when the pulse jammer is on, a fraction
$\delta$ of the time, the noise standard deviation is $\sigma = \sqrt{N_j/2} \gg \sqrt{N_0/2}$,
and a fraction $(1-\delta)$ of the time the noise standard deviation is $\sigma = \sqrt{N_0/2}$,
i.e., that of the background noise. Thus, the equation for the ideal log-likelihood ratio as a
function of the demodulator output voltage, $\nu$, the signal mean, $\mu = \sqrt{E_s}$, the
jammer duty factor, $\delta$, and the background and jammer noise standard deviations,
$\sqrt{N_0/2}$ and $\sqrt{N_j/2}$, is given by $\mathrm{LLR}(\nu)$, where

$$
\mathrm{LLR}(\nu) = \log
\frac{(1-\delta)\, N\!\left(\nu, \sqrt{E_s}, \sqrt{N_0/2}\right) + \delta\, N\!\left(\nu, \sqrt{E_s}, \sqrt{N_j/2}\right)}
     {(1-\delta)\, N\!\left(\nu, -\sqrt{E_s}, \sqrt{N_0/2}\right) + \delta\, N\!\left(\nu, -\sqrt{E_s}, \sqrt{N_j/2}\right)}
\tag{2}
$$
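
Equation (2) is straightforward to evaluate numerically. The sketch below assumes a base-10
logarithm, chosen to agree with the definition of R as a power of 10 in the Error Signal
discussion; the function and parameter names are illustrative.

```python
import math


def gaussian_pdf(v, mean, sigma):
    """Gaussian density N(v, mean, sigma)."""
    return math.exp(-((v - mean) ** 2) / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)


def ideal_llr(v, es, n0, nj, duty):
    """Ideal log-likelihood ratio for demodulator output v under pulse jamming."""
    mean = math.sqrt(es)
    s_bg = math.sqrt(n0 / 2.0)     # background noise standard deviation
    s_jam = math.sqrt(nj / 2.0)    # noise standard deviation while the jammer is on
    num = (1.0 - duty) * gaussian_pdf(v, mean, s_bg) + duty * gaussian_pdf(v, mean, s_jam)
    den = (1.0 - duty) * gaussian_pdf(v, -mean, s_bg) + duty * gaussian_pdf(v, -mean, s_jam)
    return math.log10(num / den)
```

With `duty` set to 0 the ratio reduces to a linear function of `v`, the conventional AWGN
metric. With a nonzero duty factor the metric for large `|v|` is much smaller than the linear
value, because large excursions are more likely to have been caused by the jammer. For very
large `|v|` with `duty` equal to 0 the denominator can underflow, so a practical version would
work with logarithms of the densities.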

Behavior of the ideal log-likelihood ratio as given by (2) is shown in Fig. 1 for a pulse
jammer 20 dB larger than the background noise and with duty factors of 0.05, 0.1, and
0.15. The log-likelihood ratio departs significantly from the ideal linear case for only
background noise, but there is not a great deal of variation as the jammer duty factor,
$\delta$, is varied. The main effect is that the positive and negative peaks are reduced as the
duty factor increases because the reliability of decisions at those voltages decreases. In
contrast, there is a much more significant change in the shape of the ideal log-likelihood
function as jammer power varies with constant jammer duty factor, as shown in Fig. 2. The
main effect is that as the jammer power is reduced, the reliability of bit decisions for larger
net input voltages is improved significantly, causing the log-likelihood ratio to increase. The
hope is that the neural net can converge to a log-likelihood function that is quite close to the
optimal. In this process it is most important that the neural net provide near zero
log-likelihoods for input voltages corresponding to unreliable symbols. This allows the
Viterbi decoder to treat such symbols as unreliable in accumulating path metrics, thereby
minimizing their effect on decoded bit decisions. Performance will be significantly degraded
if larger log-likelihood ratios are produced for input voltages corresponding to a high
percentage of unreliable symbols. What this shows is that there is a need to have a neural net
approach that can actively track changing jamming conditions to provide the decoder with
the best log-likelihood ratio metrics at a given time. As part of our development plan we
want to ensure that the approach does the best possible job of converging to near-optimal
metrics while having the capability to quickly track changes in noise conditions.

Results

Figures 4.9-1 and 4.9-2 display the soft decision metric transfer function produced by the
network before, during, and after training for a 5% jammer which is 20 dB greater in power
than the background noise. Eb/No is 4.5 dB.

Figure 4.9-1 is the initial transfer function, with a linear characteristic which is optimal for
an AWGN environment. Figure 4.9-2 Plot A1 displays the transfer function near the end
of training. Plot A2 shows the final transfer function.

Figure 4.9-3 is a plot of the upper half of the theoretical optimum transfer function.

Figure 4.9-4 is a graph of Bit Error Rate curves comparing the neural network performance
to a conventional linear soft decision metric and theoretical bounds. The theoretical curves
assume infinite soft decision quantization, knowledge of pulse jammer on/off times, and
knowledge of the relative magnitude of the jammer over background noise.

Figure 4.9-1 Initial Neural Network Soft-Decision Metric (Before Adaptation)

Figure 4.9-3 Optimum Soft-Decision Metric for Pulse-Jammer Scenario (Positive
Received Signal Energy)

[Plot: Probability of Bit Error versus Eb/No (dB). Curves: Neural Network 5%; Linear 5%;
No Jammer - Ideal; 5% Jammer - Ideal; 10% Jammer - Ideal; Neural Network 10%; Linear 10%.]

Figure 4.9-4 Bit Error Rate Performance

4.10 DEMODULATION OF QPSK WITH A RECURRENT NETWORK

Application 

Demodulation  of  QPSK  over  a  Non-Linear,  Dispersive  Channel  Using  a  Fully  Recurrent 
Network 

Introduction 

We examined the use of the Fully Recurrent Network in a situation where previously the
Backpropagation Network was applied (in Section 4.5). In particular, channel equalization
shows potential as an area for Fully Recurrent Network application. Just as Decision
Feedback techniques enhance the performance of Linear Adaptive Equalizers, we expect the
Recurrent Network to be superior to Backpropagation when used for channel equalization,
whether linear or non-linear. This is due to the Fully Recurrent Network architecture,
which contains feedback from the output layer back to the input layer.

Overview 

QPSK  was  transmitted  over  a  dispersive,  discrete  channel  using  the  block  diagram  below: 


The dispersive, discrete channel was modeled by the transfer function:

$$
H(z) = 0.3482 + 0.8701\,z^{-1} + 0.3482\,z^{-2}
$$

where $z^{-1}$ represents a delay of 1 symbol.
Furthermore, the channel imparts a non-linearity of:

$$
y = 0.5\,x^{3}
$$

where x is the output of the dispersive portion of the channel model, and y is the output of
the non-linearity.

Finally,  AWGN  was  added. 
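
The complete channel model (dispersion, cubic non-linearity, then AWGN) can be sketched as
follows. The report does not say how the cubic law acts on a complex QPSK sample, so applying
it separately to the I and Q components is an assumption of this sketch, as are the function
names.

```python
import random

TAPS = (0.3482, 0.8701, 0.3482)   # H(z) = 0.3482 + 0.8701 z^-1 + 0.3482 z^-2


def cubic(v):
    """y = 0.5 x^3, applied to I and Q separately (an assumption)."""
    return complex(0.5 * v.real ** 3, 0.5 * v.imag ** 3)


def channel(symbols, noise_sigma, rng):
    """Pass complex symbols through the dispersive, non-linear, noisy channel."""
    previous = [0j, 0j]            # the two most recent symbols, newest first
    received = []
    for s in symbols:
        x = TAPS[0] * s + TAPS[1] * previous[0] + TAPS[2] * previous[1]
        previous = [s, previous[0]]
        noise = complex(rng.gauss(0.0, noise_sigma), rng.gauss(0.0, noise_sigma))
        received.append(cubic(x) + noise)
    return received
```

Because the largest tap is the middle one, each received sample is dominated by the previous
symbol, with the current and the second-previous symbols acting as intersymbol interference.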
Network Parameters

Several network configurations were examined. The first was similar to that of the
Backpropagation Network in an earlier experiment. Here, the I and Q values of each of the
last three received symbols constituted the input layer. The second configuration attempted
to make use of the Recurrent Network's inherent feedback nature by inputting only the
current received symbol. It was theorized that the network would be able to form its own
representation of the multipath delay.


Paradigm                   Recurrent       Recurrent
Input Nodes                6               2
Hidden Nodes               8               6
Output Nodes               2               2
Learn Rate                 0.01            0.01
Momentum                   0.5             0.5
Update Interval            10              10
Number of Passes           100K, 100K *    100K, 100K *
Training Size              1M, 1M *        1M, 1M *
SPW Iterations per Vector  1               1

* Each of the Recurrent configurations was trained in two stages. The first stage of
training was with forced learning, i.e., the target vector is fed back to the input layer
instead of the actual output vector. The second stage of training did not involve forced
learning. This technique is necessary since the Recurrent network uses its own output as
input to the hidden layer (and also output layer) nodes. Initially the Recurrent output is
very error-prone and training will not occur if the nodes are given meaningless input.
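
The two-stage procedure amounts to choosing what is fed back at each step. In the sketch
below, `step` is a hypothetical callable standing in for one forward and training pass of the
recurrent network; it and the other names are assumptions for illustration only.

```python
def feedback_vector(target, previous_output, forced):
    """Stage 1 (forced learning) feeds back the target; stage 2 feeds back the output."""
    return list(target) if forced else list(previous_output)


def run_two_stage(step, samples, targets, stage1_len):
    """Run `step(sample, fed_back)` over the data, switching feedback after stage 1."""
    fed_back = [0.0] * len(targets[0])
    outputs = []
    for n, (x, t) in enumerate(zip(samples, targets)):
        out = step(x, fed_back)
        fed_back = feedback_vector(t, out, forced=n < stage1_len)
        outputs.append(out)
    return outputs
```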

Results 

A performance comparison was made between the Linear Adaptive Equalizer and the
Recurrent Neural Network in terms of Bit Error Rate, showing a distinct improvement in
favor of the Neural Network. The Linear Equalizer was unable to compensate for the
channel distortion regardless of noise power level, while the Neural Equalizer was able to
significantly reduce the channel effects and result in a much lower BER.

As in the Backpropagation application to this problem, the BER curves suggest several
methods of applying a Neural Network to this problem:

1. Train the Neural Network to an intermediate or expected value of noise power, then fix
the weights.

2. Train the Neural Network on a noiseless case, then fix the weights.

3. Continuously train the Neural Network as the noise power varies.

The BER curves corresponding to the first method described above are shown in Figure
4.10-1 for each of the Recurrent Network Configurations. A BER curve for the third
method would most likely be superior to both of the other methods for all ranges of noise
power. The second configuration did not perform as well as the first, but performed well
considering it uses only the current received symbol as input. BER curves from the
Backpropagation application to this problem are included for further comparison.


[Plot: horizontal axis Eb/No (dB).]

Figure 4.10-1 Bit Error Rate Performance


5.  FUTURE  NEURAL  NETWORK  TRANSCEIVER 


A high-level block diagram of a generic multiband transceiver is shown in Figure 5-1. It is
shown to illustrate where we are likely to insert neural network technology in the future.
The most likely places are in the Programmable DSP Module where baseband signal
processing is performed and in the Band Switch Controller where adaptive control is
implemented to sense the channel conditions and switch to another band (if the frequencies
are jammed or heavily used). Multiple RF modules may be employed to cover different
bands. The RF modules along with the TRANSEC (if needed) and Frequency Synthesizer
are implemented with conventional technology. When we develop the Phase II conceptual
design, it will, of course, be much more detailed and correspond more closely to the Speak
Easy radio.


Figure 5-1 High-Level Block Diagram of Generic Multiband Transceiver (Shaded Blocks
Have Potential for Utilization of Neural Network Technology).

Certain candidate problems to be addressed by neural network technology (interference
cancellation, intersymbol interference elimination, multipath combining, etc.) would
normally be implemented conventionally via an algorithm on the Programmable DSP
Module. A neural network solution for any or all of these problems could be implemented
in software on one of the DSP processors, or a neural network hardware implementation
may be more desirable because of significantly increased processing power and fault
tolerance. (SAIC is currently developing a neural network VLSI chip under DARPA
contract which can be used for this purpose.) The tradeoffs to be made in considering
software versus hardware implementations will be the subject of one of the tasks that
would make up a Phase II Neural Network Transceiver Program.

The (Phase I) Neural Network Communications Signal Processing Program (NNCSP) has
addressed the question of which (or what) communications signal processing functions
should be considered for implementation in a Neural Network Transceiver. Because
communications signal processing is a very mature technology, a greater payoff is likely if
neural network implementations are considered only for those functions for which greater
performance flexibility may be obtained or there is a processing speed and fault tolerance
advantage provided by a highly parallel neural network implementation. Problems
identified in Phase I for which this may be true include:

•  Interference cancellation or mitigation

•  Intersymbol interference elimination

•  Multipath combining

•  Joint optimization of interference cancellation, intersymbol interference elimination, and
multipath combining

•  Recognition of modulation type for an unknown waveform.

In  addition  to  these  problems  there  may  be  a  role  for  neural  network  technology  in 
mechanisms  for  adaptive  data  rate  selection  and  adaptive  band  selection.  However,  the 
focus of the Phase I investigation was on the signal processing functions which are part of
the  chain  of  transmit  and  receive  functions:  source  encoding,  encryption,  error  control 
encoding,  modulation,  demodulation,  error  control  decoding,  decryption,  and  source 
decoding.  Further,  practical  considerations  excluded  consideration  in  this  program  of 
source  encoding  and  decoding  and  of  recognition  of  modulation  type.  Within  the  scope  of 
this  program,  neural  networks  were  demonstrated  to  be  effective  in:  eliminating 
intersymbol  interference,  multipath  combining,  removing  nonlinear  distortion,  and 
reducing  transient  interference. 

While conventional signal processing has been employed to accomplish all of these
functions within specific environments, there is one category of comparison in which
neural networks provide considerable advantage. This is the category involving flexibility,
adaptivity, and robustness. In this case a neural network might not perform any better over
any narrowly defined range of application, but instead might maintain the same or roughly
the same performance over a significantly broader range of application than is possible with
any conventional signal processing technique. Thus the metric of interest in this case
measures the range of input or environmental variation over which an acceptable level of
performance can be maintained. Furthermore, conventional systems may accomplish a
required flexibility or robustness by employing a "man in the loop," and in that case a
neural network may reduce or even eliminate the need for manual intervention in some
functions.

The  future  Neural  Network  Transceiver  can  take  advantage  of  improved  flexibility, 
adaptivity and robustness by embedding neural network technology in the programmable
DSP  module  which  is  highlighted  in  Figure  5-1.  The  results  of  Section  4  showed  that 
existing  neural  network  technology  can  provide  these  benefits  by  using  neural  networks  for 
linear  and  nonlinear  equalization,  signal  detection,  and  the  generation  of  soft  decision 
metrics. 

A general, high-level conceptual design for implementing the functionality of the
programmable DSP module is illustrated in Figure 5-2. This particular conceptual design
combines and implements the neural network based adaptive signal processing functions
that were simulated during this program and that were described in Section 4. Specifically,
these functions are: a) equalization to correct for intersymbol interference, b) equalization
to correct for nonlinear amplitude and phase distortion, c) symbol detection, and d)
generation of soft bit decisions for a Viterbi error-correction decoder. In Figure 5-2,
functions (a) and (b) are grouped together in one functional block and functions (c) and (d)
are grouped together in another functional block. This particular grouping was chosen for
purposes of conceptual description and is not meant to constrain the detailed design of these
functions.

The functions are each separated into two parts: a forward processing path which does not
include training and a delayed trainable path that uses the error-corrected bits from the
Viterbi decoder as its target information. The adaptive weights that are trained in the
delayed path are transferred to the corresponding slaved function in the forward processing
path. This type of configuration was demonstrated for soft bit decisions in Section 4.9.


Figure 5-2 Future Neural Network Receiver

6. CONCLUSIONS


The objectives of the Neural Network Communications Signal Processing (NNCSP)
Program were all successfully accomplished. Specifically, the achieved objectives of the
NNCSP Program are: 1) the development and implementation of a neural network and
communications signal processing simulation system for the purpose of exploring the
applicability of neural network technology to communications signal processing, 2) the
demonstration of several configurations of the simulation to illustrate the system's ability to
model many types of neural network based communication systems, and 3) the use of the
simulation to identify neural network configurations to be included in the conceptual
design of a neural network transceiver that could be developed in a Phase II follow-on
program.

The overall goal that unites the Program objectives and gives purpose to their
accomplishment is to reach a new plateau in the state of the art of neural network based
communications signal processing (CSP). The state of the art that existed at the start of this
Program can be characterized as a collection of isolated research efforts that resulted in
publications which typically described the capability and performance of one specific neural
network approach to CSP. The capabilities of neural networks in a number of different
CSP applications had been demonstrated, but, with few exceptions, those capabilities were
not compared quantitatively with those of conventional (non-neural-network-based)
state-of-the-art CSP techniques, and furthermore the publications in most cases did not give
sufficient information for other researchers to reproduce the results or to use the results as a
foundation upon which to build.

The NNCSP Program provides tools and techniques which can be used by future
researchers to easily compare neural network and conventional techniques and to easily
exchange implementation information with other researchers. The Neural Network
Communications Simulation System (NNCSS) provides a block diagram approach to
constructing CSP configurations for simulation, and both conventional and neural network
function blocks can be used in the design and simulation of CSP products. The NNCSS
provides the capability to interchange neural network modules with similar conventional
signal processing modules, and makes it convenient to compare the performance of neural
network based approaches with conventional approaches within the same overall system.
Furthermore, the NNCSS block diagram provides implementation documentation that can
be archived in both paper and electronic forms. By saving the simulation block diagram
file in a User Library, a researcher can automatically provide the means by which to
reproduce his or her results at a later time. By sharing User Library files, researchers can
easily share the implementation details that allow other researchers to reproduce their
results.

As part of the NNCSP Program, ten different configurations of neural networks were
simulated using the NNCSS, and the results are documented in Section 4 of this report.
Nine of those simulations demonstrated neural network approaches to CSP, and seven of
those demonstrated capabilities which go beyond what is currently being implemented with
conventional technology. These simulations demonstrate the power and versatility of the
NNCSS, and they demonstrate the adaptive nonlinear capability of neural networks in
communications signal processing.

The state of the art of conventional CSP includes the capability of adaptive linear
processing and some fixed nonlinear processing, but the design of conventional CSP is
still typically limited by assumptions of linear channels and Gaussian interference. Neural
networks go beyond conventional CSP by providing the capability of adaptive nonlinear
processing which is continually self-adjusted to minimize the effects of non-Gaussian
interference. In the results reported in Section 4, neural networks were shown to be
effective in several CSP applications: equalization to correct for intersymbol interference,
equalization to correct for nonlinear amplitude and phase distortion, symbol detection for
time-varying channels, and the generation of soft bit decisions for a pulse jamming
environment.
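
As an illustration of the conventional adaptive linear processing referred to above, the
following is a minimal sketch of a least-mean-squares (LMS) equalizer correcting intersymbol
interference. It is not taken from the NNCSS: the two-path channel, the tap count, the step
size and the function name lms_equalizer_demo are all assumptions chosen for the example.

```python
import random


def lms_equalizer_demo(num_symbols=3000, taps=5, mu=0.02, seed=1):
    """Train a linear FIR equalizer with LMS on a two-path BPSK channel.

    Returns the final tap weights and the squared error at every update.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    symbols = [rng.choice((-1.0, 1.0)) for _ in range(num_symbols)]

    # Hypothetical channel: direct path plus one echo delayed by one symbol.
    received = [
        symbols[n] + (0.4 * symbols[n - 1] if n > 0 else 0.0)
        for n in range(num_symbols)
    ]

    weights = [1.0] + [0.0] * (taps - 1)  # start as a pass-through filter
    squared_errors = []
    for n in range(taps - 1, num_symbols):
        window = [received[n - k] for k in range(taps)]
        output = sum(w * x for w, x in zip(weights, window))
        error = symbols[n] - output  # training signal minus equalizer output
        for k in range(taps):
            weights[k] += mu * error * window[k]  # LMS weight update
        squared_errors.append(error * error)
    return weights, squared_errors
```

Because the update is linear in the received samples, a filter of this kind can undo a linear
echo but has no mechanism for correcting the nonlinear distortion cases listed above.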

Following  upon  the  results  of  this  Program,  there  are  numerous  opportunities  for  further 
research and development. The recommendations for further research can be grouped into
three  categories:  further  refinement  and  simulation  of  neural  network  based  CSP 
applications,  further  improvements  to  the  NNCSS,  and  the  development  of  a  prototype 
neural  network  based  transceiver. 

Additional opportunities exist for refining and extending the applications summarized in
Section 4, and there are many more applications that have been described in the literature
which can be further refined and compared to conventional approaches using the NNCSS.
While it is not practical to name here all of the potentially useful techniques that could be
investigated, one particular area of development is worth mentioning because it is a
continuation of the application described in 4.9. In particular, the neural network based
generation of soft decisions can be extended to additional modulation formats and, in
addition to pulsed jamming, the technique is applicable to atmospheric noise and to the
near-far interference problem of code-division multiple-access. The recent article by Asato
and Grover [Asato, 1993] presents performance bounds which suggest that very significant
improvements in performance can be obtained.

The  NNCSS  in  its  initial  deliverable  version  is  a  remarkably  versatile  and  capable  design 
and  simulation  tool.  Nevertheless,  several  incremental  improvements  can  be 
recommended.  The  first  set  of  recommendations  would  be  to  add  as  options  several 
learning  acceleration  techniques  for  backpropagation  such  as  delta-bar-delta,  the 
Levenberg-Marquardt algorithm, and the entropic error function. The second
recommendation  would  be  to  add  the  Radial-Basis  Function  Neural  Network  as  an 
additional  function  block. 
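
To make the first recommendation concrete, here is a minimal sketch of the delta-bar-delta
rule, in which every weight keeps its own learning rate: the rate grows additively while the
gradient keeps its sign and shrinks multiplicatively when the sign flips. The function name,
the parameter names (kappa, phi, theta) and their values are assumptions for illustration
and do not describe an NNCSS option.

```python
def delta_bar_delta(grad_fn, weights, steps=200, lr0=0.01,
                    kappa=0.01, phi=0.5, theta=0.7):
    """Gradient descent with per-weight learning rates (delta-bar-delta).

    grad_fn(weights) must return the gradient as a list of floats.
    Returns the final weights and the final per-weight learning rates.
    """
    w = list(weights)
    rates = [lr0] * len(w)
    bar = [0.0] * len(w)  # exponential average of past gradients
    for _ in range(steps):
        grad = grad_fn(w)
        for i, g in enumerate(grad):
            agreement = bar[i] * g
            if agreement > 0.0:
                rates[i] += kappa          # same sign: speed up
            elif agreement < 0.0:
                rates[i] *= (1.0 - phi)    # sign flip: slow down
            w[i] -= rates[i] * g
            bar[i] = (1.0 - theta) * g + theta * bar[i]
    return w, rates
```

For example, on an elongated quadratic bowl with gradient [w0, 10 * w1], calling
delta_bar_delta(lambda w: [w[0], 10.0 * w[1]], [1.0, 1.0]) lets the shallow direction use a
much larger rate than a single shared learning rate would safely allow.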

This Program was strategically positioned and specifically aimed at the prerequisites needed
prior to a follow-on program to develop a neural network transceiver. The necessary
design and simulation tools have been incorporated in the NNCSS, and the selection and
simulation of potential neural network based CSP functions are documented in Section 4.
Section 5 presents a high-level, conceptual design of a future neural network transceiver
which is recommended for a follow-on program.

8.0 GLOSSARY


ABAM      Adaptive Bi-directional Associative Memory
ANN       Artificial Neural Network
ANS       Artificial Neural System
ART       Adaptive Resonance Theory
AWGN      Additive White Gaussian Noise
BAF       Block Attributes File
BAM       Bi-directional Associative Memory
BER       Bit Error Rate
BEX       Block Expression file (binary)
BDE       Block Diagram Editor
BP        Backpropagation
BPSK      Binary Phase Shift Keying
BSB       Brain State in a Box
CCFB      Custom Coded Function Block
CDRL      Contract Data Requirements List
CGS       Code Generation System
COTS      Commercial Off The Shelf
CPU       Computer Processing Unit
CSC       Computer Software Component
CSCI      Computer Software Configuration Item
CSCI08    Computer Software Configuration Item No. 8 (NNCL)
CSP       Communications Signal Processing
CSU       Computer Software Unit
CUFB      Custom User Function Block
CUPS      Connection Updates Per Second
DI        Developmental Item
DID       Data Item Description
DMA       Direct Memory Access
DOD       Department of Defense
DSP       Digital Signal Processing
EXPR      EXPRession file (text)
FIR       Finite Impulse Response
FFT       Fast Fourier Transform
FMS       File Management System
GFI       Government Furnished Item
GUI       Graphical User Interface
HF        High Frequency
HWCI      Hardware Configuration Item
ISI       Inter Symbol Interference
ISL       Interactive Simulation Library
ISNN      Industrial Strength Neural Network
KTM       Kohonen Topological Map
LMS       Least Mean Squared
LTM       Long Term Memory
MCL       Macro Command Language
MHz       Mega (Million) Hertz (cycles per second)
MIPS      Million Instructions Per Second
ML        Maximum Likelihood
MMI       Man/Machine Interface
NDI       Non-Developmental Item
NNCL      Neural Network Communications Library
NNCSP     Neural Network Communications Signal Processing
NNCSS     Neural Network Communications Simulation System
NNO       Neural Network Object
NNOC      Neural Network Object Control
NNOM      Neural Network Object Manager
PN        Probabilistic Network
QAM       Quadrature Amplitude Modulation
QPSK      Quadrature Phase Shift Keying
RC        Recurrent
RF        Radio Frequency
RFP       Request for Proposal
RISC      Reduced Instruction Set Computer
RLS       Recursive Least Squares
RMS       Root Mean Square
SFS       Signal Flow Simulation
SigCalc   Signal Calculator™
SOFM      Self Organizing Feature Map
SPB       Simulation Program Builder
SPW       Signal Processing WorkSystem™
SSDD      System/Segment Design Document
SDD       Software Design Document
STM       Short Term Memory
TIL       Tool Interface Language™
TWT       Traveling Wave Tube
VLSI      Very Large Scale Integration

APPENDIX

PROCEDURES  FOR  RUNNING  THE  SIMULATIONS 

These instructions give steps to execute the simulations described in Section 4. It is
assumed that SPW and the associated NNCSS tapes have been installed and a simulation
kernel in nncss_all has been created. In each of the following ten simulation procedures,
"Run the simulation for XXX iterations" will mean to perform the following operations:

Select Tools-Simulator-Run.
Select More Options.
Enter (or select from the extended dialog button) nncss_all as the Simulation Kernel.
Press OK.
Enter XXX in the No. of Iterations field.
Press Start.

1. Non-Linear Mapping by a Backpropagation Neural Network (refer to
4.1)

Step 1 Make sure that weights are initialized from random values instead of from the
stopping point of a previous execution of this system:
rm /spwsys/pool/nncss_all/sqr.net

Step 2 Open the SPW simulation model entitled square(10).system.

Step 3 Run the simulation for 10K iterations. This will display the network learning the
function 0.5x^3.

Step 4 Change the value of the constant feeding into the y-input of the x^y block to 2.0.
This will display the network learning the function 0.5x^2, starting with the weights
which resulted from training the network to learn the 0.5x^3 function.

Step 5 Select File-Close on the Simulation Run window.
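
For readers without access to SPW, the kind of learning displayed in this procedure can be
sketched outside the simulator. The following is a minimal backpropagation network with one
hidden layer that learns 0.5x^3 on the interval [-1, 1]. The layer size, learning rate,
input range and the function name train_cubic are assumptions made for the sketch and are
not the settings of the square(10) model.

```python
import math
import random


def train_cubic(hidden=8, epochs=1000, lr=0.05, seed=0):
    """Fit y = 0.5 * x**3 with a 1-hidden-layer tanh network via backprop.

    Returns (mean squared error before training, mean squared error after).
    """
    rng = random.Random(seed)
    xs = [i / 10.0 for i in range(-10, 11)]
    ys = [0.5 * x ** 3 for x in xs]

    # Random initial weights, as in Step 1 of the procedure.
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2

    def mse():
        return sum((forward(x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

    initial = mse()
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            h, out = forward(x)
            err = out - y
            for j in range(hidden):
                # Error propagated back through the tanh hidden unit.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err
    return initial, mse()
```

Calling train_cubic() returns the error before and after training, which plays the role of
the error trace shown in the simulation window.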

2. Equalization of Multipath Distorted 64-QAM using a Backpropagation
Neural Network (refer to 4.2)

This demonstration displays the results of using a trained Backpropagation Network to
equalize a Multipath-Distorted 64-QAM signal. The actual training of the network is
time-consuming and is not appropriate for an interactive demonstration using the
Interactive Simulation Library (ISL). To execute the ISL demonstration:

Step 1 Since this demonstration displays results for a previously trained network, we must
begin with the weights which resulted from this earlier training:
cp /spwsys/pool/tbill_3_all/64qam_3.net /spwsys/pool/nncss_all/64qam_3.net

Step 2 Open the SPW simulation model entitled bp_eval(27).system.

Step 3 Turn learning off by changing the value of the constant feeding into the train input
to the bpnet block to 0.0.

Step 4 Run the simulation for 30K iterations.

Step 5 Select File-Close on the Simulation Run window.

To see that the neural network can indeed be trained to equalize a Multipath-Distorted
64-QAM signal, this simulation can be executed in a non-ISL mode with the following
modifications:

Step 1 Make sure that weights are initialized from random values instead of from the
stopping point of a previous execution of this system:
rm /spwsys/pool/nncss_all/64qam_3.net

Step 2 If it is not already on, turn learning on by changing the value of the constant feeding
into the train input to the bpnet block to 1.0.

Step 3 To expedite the simulation, select and cut all of the ISL blocks in the ISL Output
portion of the system diagram.

Step 4 In the Network and Output Control portion of the system diagram, edit the
parameter on the Unit Step block which feeds into the Inverter and wait connector.
Change its value to 950000. This will cause signal files to be written to disk only
after simulation iteration 950000 has been reached, thus writing less to disk.

Step 5 Run the simulation for 1 million iterations. This may take about 90 minutes to
complete.

Step 6 After completion, press SigCalc on the Simulation window to view resulting
signals.

Step 7 Select File-Close on the Simulation Run window.

3. Use of a Backpropagation Neural Network to Control a Bank of
Equalizers in a Dynamic Multipath Environment (refer to 4.3)

This demonstration displays the results of using a trained Backpropagation Network to
equalize a Dynamic Multipath-Distorted 64-QAM signal using the outputs of a bank of
equalizers. The actual training of the network is time-consuming and is not
appropriate for an ISL demonstration. To execute the ISL demonstration:

Step 1 Since this demonstration displays results for a previously trained network, we must
begin with the weights which resulted from this earlier training:
cp /spwsys/pool/tbill_3_all/mmpath_1.net /spwsys/pool/nncss_all/mmpath_1.net

Step 2 Open the SPW simulation model entitled bp_mmtest(8).system.

Step 3 Run the simulation for EOF iterations. The simulation will read test data from a
file, re-starting from the beginning when the end is reached. When satisfied with
the ISL display, press Abort in the Simulation Run window.

Step 4 Select File-Close on the Simulation Run window.

4. Demodulation of Non-Linearly Distorted 16 QAM Using
Backpropagation (refer to 4.4)

Step 1 Make sure that weights are initialized from random values instead of from the
stopping point of a previous execution of this system:
rm /spwsys/pool/nncss_all/classify6.net

Step 2 Open the SPW simulation model entitled classify(16).system.

Step 3 Run the simulation for 25K iterations. You will observe the error rate of the neural
network decrease drastically over the course of the simulation as the network learns
better decision regions, eventually performing better than the Linear Decision
Regions for Ideal 16 QAM.

Step 4 Select File-Close on the Simulation Run window.

5. Demodulation of QPSK over a Non-Linear, Dispersive Channel using
Backpropagation Neural Network (refer to 4.5)

This demonstration displays the results of using a trained Backpropagation Network to
demodulate a QPSK signal which has been transmitted over a non-linear, dispersive,
AWGN channel. The actual training of the network is time-consuming and is not
appropriate for an ISL demonstration. To execute the ISL demonstration:

Step 1 Since this demonstration displays results for a previously trained network, we must
begin with the weights which resulted from this earlier training:
cp /spwsys/pool/tbill_3_all/qpsk6.net /spwsys/pool/nncss_all/qpsk6.net

Step 2 Open the SPW simulation model entitled bp_qpsk(13).system.

Step 3 Make sure that learning is off by changing the value of the constant feeding into the
train input to the bpnet block to 0.0.

Step 4 Run the simulation for 10K iterations. The in-phase and quadrature components
produced by the neural network (shown in a constellation diagram) are thresholded
to produce the demodulated QPSK symbol.

Step 5 Select File-Close on the Simulation Run window.

To see that the neural network can indeed be trained to demodulate a QPSK signal which
has been transmitted over a non-linear, dispersive, AWGN channel, this simulation can be
executed in a non-ISL mode with the following modifications:

Step 1 Make sure that weights are initialized from random values instead of from the
stopping point of a previous execution of this system:
rm /spwsys/pool/nncss_all/qpsk6.net

Step 2 If it is not already on, turn learning on by changing the value of the constant feeding
into the train input to the bpnet block to 1.0.

Step 3 To expedite the simulation, select and cut all of the ISL blocks in the ISL Output
portion of the system diagram.

Step 4 In the Network and Output Control portion of the system diagram, edit the
parameter on the Unit Step block which feeds into the Inverter and wait connector.
Change its value to 950000. This will cause signal files to be written to disk only
after simulation iteration 950000 has been reached, thus writing less to disk.

Step 5 Run the simulation for 1 million iterations. This may take several hours to
complete.

Step 6 After completion, press SigCalc on the Simulation window to view resulting
signals.

Step 7 Select File-Close on the Simulation Run window.

6. Demodulation of QPSK using a Configuration of Kohonen and
Outstar Neural Networks (refer to 4.6)

Step 1 Make sure that weights are initialized from random values instead of from the
stopping point of a previous execution of this system:
rm /spwsys/pool/nncss_all/koh3.net

Step 2 Open the SPW simulation model entitled koh_qpsk(8).system.

Step 3 The Kohonen and Outstar networks each are controlled by a Neural Network Object
Controller (NNOC) block. The NNOC for the Outstar is on the top system level,
while the NNOC for the Kohonen Network is inside the KTMNET block. The
Delta Learning Threshold parameter inside each of these NNOC blocks should be
changed to -1.0. This is necessary since there is no delta input to the NNOCs. By
definition, if the delta signal is less than or equal to the value of the Delta Learning
Threshold parameter, then weights will no longer be updated.

Step 4 Run the simulation for 20K iterations.

Step 5 Select File-Close on the Simulation Run window.
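
The reason for the -1.0 setting in Step 3 can be stated as a one-line rule. The sketch below
encodes only the definition quoted in that step. The function name is hypothetical, and the
idea that an unconnected delta input behaves like a constant 0.0 is an assumption made for
the illustration, not a documented property of the NNOC block.

```python
def weights_update_enabled(delta, delta_learning_threshold):
    """Return True while weights are still updated.

    Per the definition in Step 3, updating stops once the delta signal is
    less than or equal to the Delta Learning Threshold parameter.
    """
    return delta > delta_learning_threshold


# Assumption for illustration: an unconnected delta input reads as 0.0.
UNCONNECTED_DELTA = 0.0
```

Under that assumption, weights_update_enabled(UNCONNECTED_DELTA, 0.0) is False, so a
non-negative threshold would freeze the weights, while a threshold of -1.0 keeps learning
enabled.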

7. Demodulation of Non-Linearly Distorted 16 QAM using a
Configuration of Kohonen and Outstar Neural Networks (refer to
4.7)


Step 1 Make sure that weights are initialized from random values instead of from the
stopping point of a previous execution of this system:
rm /spwsys/pool/nncss_all/koh16qam13.net

Step 2 Open the SPW simulation model entitled 16qam_twt(18).

Step 3 The Kohonen and Outstar networks each are controlled by a Neural Network Object
Controller (NNOC) block. The NNOC for the Outstar is on the top system level,
while the NNOC for the Kohonen Network is inside the KTMNET block. The
Delta Learning Threshold parameter inside each of these NNOC blocks should be
changed to -1.0. This is necessary since there is no delta input to the NNOCs. By
definition, if the delta signal is less than or equal to the value of the Delta Learning
Threshold parameter, then weights will no longer be updated.

Step 4 Run the simulation for 35K iterations.

Step 5 Select File-Close on the Simulation Run window.

8. Demodulation of 16 QAM over a Rayleigh Channel using a
Configuration of Kohonen and Outstar Neural Networks (refer to
4.8)

Step 1 Make sure that weights are initialized from random values instead of from the
stopping point of a previous execution of this system:
rm /spwsys/pool/nncss_all/mobile.net

Step 2 Open the SPW simulation model entitled mobile(15).

Step 3 The Kohonen and Outstar networks each are controlled by a Neural Network Object
Controller (NNOC) block. The NNOC for the Outstar is on the top level inside of
the Kohonen Equalizer block, while the NNOC for the Kohonen Network is inside
the KTMNET block, which is also inside of the Kohonen Equalizer block. The
Delta Learning Threshold parameter inside each of these NNOC blocks should be
changed to -1.0. This is necessary since there is no delta input to the NNOCs. By
definition, if the delta signal is less than or equal to the value of the Delta Learning
Threshold parameter, then weights will no longer be updated.

Step 4 Run the simulation for 35K iterations.

Step 5 This demonstration contains many controls which make it rather complicated.
There are 3 Eye and Scatter diagrams which appear. In each case, the Eye pattern
produced is to be ignored. The Eye and Scatter blocks were used simply for their
built-in controls. The leftmost scatter plot is the received constellation from the
Rayleigh channel. Over the course of the demonstration, it will contract, expand,
and rotate. As it contracts and expands, it will become necessary to adjust the Gain
via the scroll bar to keep its display within the bounds of the scatter window. The
center scatter plot is the resulting constellation produced by a stand-alone Linear
Adaptive Equalizer. Given a training signal, the equalizer can adjust to the Rayleigh
effects. As the Rayleigh fades occur, the equalizer will lose track of the
rotated/compressed/expanded constellation and again will require a training signal to
re-adjust.

The rightmost scatter plot is a plot of the Kohonen weights. These weights will converge
to the centers of the clusters formed by the equalizer portion of the Kohonen Equalizer.
They will individually move in order to quickly track the rotating/compressing/expanding
constellation.

Below  the  Eye  and  Scatter  controls  are  several  pushbuttons  which  control  training  and 
learning  for  the  stand-alone  Linear  Adaptive  Equalizer  and  the  Kohonen  Equalizer. 

Train KTM - Toggles weight adjustment for the Kohonen portion of the Kohonen Equalizer,
whether in normal or adaptive mode.

Train Equalizer - Toggles the presentation of the actual transmitted 16 QAM signal as a
training signal to the stand-alone Linear Adaptive Equalizer.

Train Outstar - Toggles the presentation of the actual transmitted 16 QAM signal as a
training signal to the Outstar Network.

Use Training Sig - Toggles the presentation of the actual transmitted 16 QAM signal as a
training signal to the equalizer portion of the Kohonen Equalizer.

Use Conventional Decision Regions - In the Kohonen Equalizer, an error signal is fed back
to update the weights of the equalizer portion. This pushbutton controls the method of
creating this error signal. If the button is pressed, the error signal is the difference between
the closest Ideal QAM symbol and the equalizer output. If the button is not pressed, the
error signal is the difference between the weights corresponding to the winning Kohonen
node and the equalizer output.

Re-Train  KTM  -  If  a  severe  null  occurs,  the  Kohonen  Equalizer  will  be  unable  to  track  the 
moving  symbol  constellation,  and  will  require  re-training.  When  this  button  is  pressed, 
retraining  will  occur.  Note  that  re-training  the  Kohonen  network  will  require  a  re-mapping 
(and  hence  re-training)  of  the  Outstar  Network. 

At the bottom of the display are two bar graphs which display symbol errors. The left
graph corresponds to symbol errors made by the Linear Adaptive Equalizer, and the right
graph to those made by the Kohonen Equalizer. Note that when a training signal is given to
either equalizer, the corresponding symbol error graph is disabled and will show a zero value.

Once the ISL Window appears on the screen, first set the Scatter Persistence scroll bar on
the rightmost scatter plot to 16, since we are interested in the weights corresponding to the
16 Kohonen nodes. Initially, all buttons except for the Re-Train KTM button should be
depressed. Around simulation iteration 5000, the Kohonen network enters adaptive mode.
At this time, the Kohonen weights have found the symbol centers of the equalized
constellation and are tracking movement. The stand-alone equalizer too has been trained to
equalize the received signal. Train Equalizer and Use Training Sig may be turned off.
Over the next 6000 simulation iterations, the Outstar Network will map each Kohonen node
to an explicit 16 QAM symbol. The Outstar training may be accelerated by increasing the
learn and decay rates. At around iteration 11K, Train Outstar may be turned off.
Depending upon the noise seed and the Doppler of the Rayleigh channel (the scrollbar
entitled Amplitude on the display), the Kohonen Equalizer and the stand-alone equalizer
will correctly demodulate the received signal, as evidenced by the two bar graphs at the
bottom of the display. Usually, the stand-alone equalizer will fail first during appearance
of nulls. Failing is indicated by a large number of spikes (corresponding to symbol errors)
in the bar graphs.

Once either equalizer has failed, a training sequence is required for proper demodulation to
resume. If the stand-alone equalizer fails, press the Train Equalizer button. The equalizer
will require some time to re-adjust its weights. Once the center constellation appears fairly
organized, the Train Equalizer button may be turned off. Note that when a training signal
is given to either equalizer, the corresponding symbol error graph is disabled and will show
a zero value. If the Kohonen Equalizer fails, the KTM must be redone, and hence the
Outstar mapping. This is accomplished by depressing Train KTM, Train Outstar, Use
Training Sig, Use Conventional Decision Regions, and Re-Train KTM. After several
thousand iterations, the KTM is ready to resume adaptive mode. Toggle Use Training Sig,
Use Conventional Decision Regions, and Re-Train KTM. The Kohonen weights should
now appear at the 16 QAM cluster centers. The Outstar may then be trained. About 8000
iterations later, Train Outstar may be turned off.

Step 6 Select File-Close on the Simulation Run window.
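
The tracking behaviour described in Step 5 for the rightmost scatter plot can be sketched
with a simplified, winner-take-all form of the Kohonen update: for each received sample the
closest weight vector is moved a small fraction of the way toward that sample. The sketch
below applies this to a slowly rotating 16 QAM constellation. It omits the neighbourhood
function, the equalizer and the Rayleigh channel model, and the rotation rate, noise level,
learning rate and function name are assumptions made for the example.

```python
import math
import random


def track_rotating_16qam(num_samples=4000, total_rotation=0.3, lr=0.1,
                         noise_std=0.1, seed=2):
    """Track a slowly rotating 16 QAM constellation with 16 weight vectors.

    Returns (final weights, ideal constellation rotated by total_rotation),
    both listed in the same node order.
    """
    rng = random.Random(seed)
    levels = (-3.0, -1.0, 1.0, 3.0)
    ideal = [(i, q) for i in levels for q in levels]
    weights = [list(p) for p in ideal]  # one weight vector per Kohonen node

    for n in range(num_samples):
        theta = total_rotation * (n + 1) / num_samples  # slow phase drift
        i, q = rng.choice(ideal)
        x = i * math.cos(theta) - q * math.sin(theta) + rng.gauss(0.0, noise_std)
        y = i * math.sin(theta) + q * math.cos(theta) + rng.gauss(0.0, noise_std)
        # Winning node: the weight vector closest to the received sample.
        winner = min(weights, key=lambda w: (w[0] - x) ** 2 + (w[1] - y) ** 2)
        winner[0] += lr * (x - winner[0])
        winner[1] += lr * (y - winner[1])

    c, s = math.cos(total_rotation), math.sin(total_rotation)
    rotated = [(i * c - q * s, i * s + q * c) for i, q in ideal]
    return weights, rotated
```

No training signal is used in the loop, which is the property that lets the weights follow
the constellation once the network is in adaptive mode. If the constellation jumps by more
than half the spacing between symbols, the winners become wrong, which corresponds to the
re-training case described above.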
9. Improving Viterbi Decoder Soft Decisions in a Pulse Jamming
Environment (refer to 4.9)

Step 1 We initialize the network weights to values previously determined by training the
network to learn an identity mapping. Thus the initial transfer function for the
neural network will be linear over a range of signal values typical of a no-jammer
scenario. This transfer function is similar to a conventional soft decision metric. To
do this:
cp /spwsys/pool/tbill_3_all/viterbi_id_10nodes /spwsys/pool/nncss_all/viterbi3.net

Step 2 Open the SPW simulation model entitled bp_viterbi(24).system.

Step 3 Run the simulation for 70K iterations. You will observe the neural network soft
decision metric adapt from its initial form to a form similar to the theoretical
optimum.

Step 4 Select File-Close on the Simulation Run window.
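
To give a feel for why a linear soft decision metric is a poor match for pulse jamming, the
sketch below compares it with a log-likelihood ratio computed for one simple textbook model:
BPSK symbols at +1 and -1, with Gaussian noise whose standard deviation is small most of the
time and large during a jamming pulse. This model, the parameter values and the function
names are assumptions for illustration; they are not necessarily the channel model or the
optimum metric used in Section 4.9.

```python
import math


def gaussian_pdf(x, mean, std):
    """Probability density of a normal distribution."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def pulse_jam_llr(r, rho=0.2, sigma_quiet=0.3, sigma_jammed=2.0):
    """Log-likelihood ratio of +1 versus -1 when the jammer state is unknown.

    rho is the assumed fraction of symbols hit by the jammer.
    """
    def likelihood(symbol):
        return ((1.0 - rho) * gaussian_pdf(r, symbol, sigma_quiet)
                + rho * gaussian_pdf(r, symbol, sigma_jammed))

    return math.log(likelihood(+1.0) / likelihood(-1.0))


def linear_metric(r, sigma_quiet=0.3):
    """Log-likelihood ratio that would apply if there were never any jamming."""
    return 2.0 * r / sigma_quiet ** 2
```

Under this model the two metrics agree in sign, but for large received amplitudes the
mixture metric grows far more slowly than the linear one: an unusually large sample is most
likely a jammed one and therefore deserves less weight in the decoder.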

10. Demodulation of QPSK Over a Non-Linear, Dispersive Channel
Using a Fully Recurrent Network (refer to 4.10)

This demonstration displays the results of using a trained Recurrent Network to demodulate
a QPSK signal which has been transmitted over a non-linear, dispersive, AWGN channel.
The actual training of the network is time-consuming and is not appropriate for an ISL
demonstration. To execute the ISL demonstration:

Step 1 Since this demonstration displays results for a previously trained network, we must
begin with the weights which resulted from this earlier training:
cp /spwsys/pool/tbill_3_all/qpsk6new.net /spwsys/pool/nncss_all/qpsk6new.net

Step 2 Open the SPW simulation model entitled rc_qpsk(3).system.

Step 3 Make sure that learning and forced are off by changing the value of the constants
feeding into the train and forced inputs to the rcnet block to 0.0.

Step 4 Run the simulation for 20K iterations. The in-phase and quadrature components
produced by the neural network (shown in a constellation diagram) are thresholded
to produce the demodulated QPSK symbol.

Step 5 Select File-Close on the Simulation Run window.

To see that the neural network can indeed be trained to demodulate a QPSK signal which
has been transmitted over a non-linear, dispersive, AWGN channel, this simulation can be
executed in a non-ISL mode with the following modifications:

Step 1 Make sure that weights are initialized from random values instead of from the
stopping point of a previous execution of this system:
rm /spwsys/pool/nncss_all/qpsk6new.net

Step 2 If it is not already on, turn learning on and forced on by changing the value of the
constants feeding into the train input and forced input to the rcnet block to 1.0.

Step 3 To expedite the simulation, select and cut all of the ISL blocks in the ISL Output
portion of the system diagram.

Step 4 In the Network and Output Control portion of the system diagram, edit the
parameter on the Unit Step block which feeds into the Inverter and wait connector.
Change its value to 950000. This will cause signal files to be written to disk only
after simulation iteration 950000 has been reached, thus writing less to disk.

Step 5 Run the simulation for 1000000 iterations. This may take several hours to
complete.

Step 6 The Recurrent Network has been trained with forced learning. It now requires
further training with unforced learning. Turn learning on and forced off.

Step 7 In the Network and Output Control portion of the system diagram, edit the
parameter on the Unit Step block which feeds into the Inverter and wait connector.
Change its value to 950000. This will cause signal files to be written to disk only
after simulation iteration 950000 has been reached, thus writing less to disk.

Step 8 Run the simulation for 1000000 iterations. This may take several hours to
complete.

Step 9 After completion, press SigCalc on the Simulation window to view resulting
signals.

Step 10 Select File-Close on the Simulation Run window.


